SUMMARY


Problem 

Occupational  task  inventories  have  been  administered  to  hundreds  of  thousands  of  job 
incumbents  in  the  military  service.  The  collected  data  are  used  by  management  for 
several  important  decisions,  including  the  specification  of  occupational  standards,  the 
design  of  training  curricula,  and  the  structuring  of  occupational  specialties.  Surveying 
large  numbers  of  incumbents  places  heavy  time  demands  on  operating  units  and  also 
results  in  high  data  processing  costs.  Thus,  the  problem  is  how  to  minimize  these  costs 
while  selecting  sample  sizes  and  inventory  response  scales  adequate  to  obtain  stable, 
useful  data. 

Objectives 

The  objectives  of  this  study  were  to  evaluate:  (1)  the  stability  and  interrelationship 
of  two  types  of  job  task  scales— the  continuous  Relative  Time-Spent  and  the  dichotomous 
Task-Performed  scales,  (2)  the  stability  of  "job  types"  (i.e.,  clusters  of  job  incumbents) 
derived  from  scale  responses,  and  (3)  the  change  in  stability  when  sample  size  is  reduced. 

Approach 

Scale  stability  was  evaluated  by  comparing  the  profiles  of  average  scale  scores 
between  randomly  split  samples  for  each  pay  grade  of  each  occupation;  and  scale 
interrelationship, by comparing the profiles across scales. Job type (i.e., cluster) stability
was  evaluated  by  comparing:  (1)  score  profiles  (between  clusters),  (2)  number  of  tasks 
performed  by  incumbents  in  the  clusters,  and  (3)  the  "fit"  of  individual  incumbent  profiles 
to  the  cluster  profile.  The  change  in  stability  with  reduced  sample  size  was  evaluated 
using  a  "pay-off"  strategy;  that  is,  instead  of  seeking  a  rationale  to  justify  a  requirement 
for  a  particular  level  of  stability,  gains  in  stability  were  tracked  with  increases  in  sample 
size.  Essentially,  if  the  gains  dropped  off— if  the  stability  indices  became  sharply 
asymptotic— there  would  be  little  justification  for  increasing  sample  size  beyond  that 
point. 


The task data analyzed were from four ratings representative of different occupational
areas: Aviation Machinist's Mate (AD), Electronics Technician (ET), Torpedoman's
Mate  (TM),  and  Yeoman  (YN). 

Findings 


1.  The  stability  of  both  the  continuous  (Relative  Time-Spent)  and  dichotomous 
(Task-Performed)  scales  was  quite  high  (correlations  in  the  .90s).  When  average  Relative 
Time-Spent  per  task  (i.e.,  on  the  continuous  scale)  was  calculated  on  only  those 
incumbents  actually  performing  a  task  (i.e.,  Relative  Time-Spent  greater  than  zero), 
however, the stability was very low (.20s to .30s).

2.  The  two  types  of  scales  provided  highly  redundant  information,  as  indicated  by 
the  similarity  of  rank  orders  of  tasks  by  Relative  Time-Spent  and  Percent  Performing 
profiles  (correlations  in  mid  .90s). 

3.  The  average  score  on  each  task  by  the  Relative  Time-Spent  (continuous)  scale 
was  generally  very  small,  often  less  than  1  percent  of  the  total  time  spent,  suggesting 
that members in a pay grade spend, on the average, less than 1 percent of their time
performing any single task. Essentially, these time estimates are so small because they
have been made proportional (or relative) over all tasks responded to in the inventory.
Meaningful interpretation of such small values is difficult.

4. High scale stability was obtained for sample sizes substantially smaller than
those specified by management. In plotting the stability indices for varying sample sizes,
the curves became sharply asymptotic (indicating limited improvement) for pay grade
samples greater than 40 (or 150 by a more rigorous criterion).

5. Similarly, cluster solution stability was achieved for occupation samples (total of
all pay grades) of 1000, which are substantially smaller than the samples of 2000 or
greater  presently  analyzed. 


Conclusions 

1.  The  dichotomous-type  Task-Performed  scale  yields  stable,  meaningful  task 
information  from  job  incumbent  responses.  No  practical  gain  in  information  is  achieved 
from  the  continuous  Relative  Time-Spent  scale.  More  informative,  more  efficiently 
collected  estimates  of  the  time-spent  per  task  could  probably  be  based  on  incumbents' 
ranking  of  a  small  number  of  the  most  time-consuming  tasks. 

2. Highly stable scale data and cluster solutions are obtainable from samples
substantially  smaller  than  those  previously  administered. 

3. This study's empirically developed relationship between sample size and stability
can  be  usefully  employed  to  determine  cost-effective  sampling  for  task  inventory  surveys. 
For  example,  for  the  large  occupational  populations  of  Navy  ratings  alone,  use  of  these 
aids  may  reduce  the  time  demands  on  the  fleet  by  about  52,000  work-hours  per  cycle  of 
inventory  administration. 

Recommendations 

It  is  recommended  that: 

1.  The  Relative  Time-Spent  scale  be  deleted  from  task  analysis  inventories. 

2.  Alternative  methods  of  estimating  time  spent  performing  tasks,  including 
ranking  of  the  most  time-consuming  tasks,  be  used  on  a  trial  basis  in  task  inventory 
surveys. 

3.  Responses  to  a  currently  administered  inventory  scale  (see  page  21)  be  used  to 
calculate  the  percentage  of  incumbents  performing  each  task. 

4. This study's empirically developed guidelines for sample size determination be
employed. 
INTRODUCTION


Problem 


Job  information  is  collected  by  the  military  services,  on  a  recurring  basis,  by 
administering  task  inventories  (i.e.,  structured  work  analysis  questionnaires)  to  large 
samples  of  job  incumbents.  Since  the  1960s,  more  than  800,000  incumbents  in  the  military 
services  have  completed  inventories  that  often  contain  800  to  1000  items.  The  data 
obtained aid management in specifying occupational standards, designing training
curricula, and structuring occupational specialties. Sample sizes of Navy personnel
administered task inventories range from about 100 to 4000; sample sizes for ratings with large
populations  (e.g.,  AD— Aviation  Machinist's  Mate)  tend  to  be  about  2500;  and  those  for 
ratings with smaller populations (e.g., TM--Torpedoman's Mate), about 500.¹

Surveying  large  numbers  of  incumbents  to  provide  job  information  results  in  high  data 
acquisition  costs,  including  work  time  losses  to  the  operating  units  and  costs  incurred 
from  large-scale  data  processing.  Thus,  the  problem  is  to  select  task  inventory  response 
scales  and  sample  sizes  that  will  minimize  these  costs  and  yet  yield  stable,  useful  data. 

Background 

In  selecting  useful  scales  and  sample  sizes,  the  type  of  information  to  be  collected 
and  the  analyses  to  be  performed  need  to  be  considered.  One  of  the  most  important  types 
of  task  information  collected  by  the  military  services  is  the  estimate  of  the  percentages 
of  personnel  performing  particular  tasks  or  using  specific  equipment.  This  information  is 
used  to  verify  or  modify  occupational  standards  and  structures  by  determining  the 
similarity  or  dissimilarity  of  tasks  performed  within  different  occupational  specialties. 
Another  important  use  of  the  collected  task  data  is  to  identify  "job  types"  (i.e.,  clusters) 
by  grouping  persons  performing  jobs  with  similar  task  requirements.  The  identification  of 
these clusters, for example, can make substantial contributions to training cost-effectiveness
by tailoring training courses to the specific types of jobs, thereby providing an
objective  basis  for  determining  the  numbers  of  students  for  these  courses  and  the  content 
of  the  curriculum. 

The  Navy  collects  task  data  by  having  job  incumbents  indicate  the  relevance  of  each 
task  in  the  inventory  booklet  to  their  particular  job  by  responding  to  the  Relative  Time- 
Spent  scale.  This  scale  is  a  five-point  Likert-type  scale  of  time  spent  on  a  task,  with 
points  ranging  from  "very  much"  (a  score  of  5)  to  "very  little"  (a  score  of  1).  The  Relative 
Time-Spent scale responses are converted into scores on two additional scales: a Relative
Time-Spent Percentage scale and a dichotomous Task-Performed scale. The former is
simply  a  conversion  of  the  Relative  Time-Spent  responses  to  percentages  that  sum  to  100 
percent  for  all  tasks  performed  by  an  individual  (see  Appendix  A  for  conversion 
procedure).  The  Task-Performed  scale  is  a  two-point  scale  indicating  whether  an 
incumbent  performs  or  does  not  perform  a  task;  that  is,  the  scale  score  of  1  indicates  the 
task is performed; and a 0, that it is not. The Task-Performed scores of 1 are derived
from  any  response  on  the  Relative  Time-Spent  scale,  while  the  scores  of  0  are  derived 
from any non-response (i.e., blank) on the Relative Time-Spent scale. All references to
Relative  Time-Spent  estimates  or  scores  in  the  following  text  will  refer  to  the  converted 
scores  (i.e.,  percentages). 
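
The derivation of the two scales from the five-point responses can be sketched as follows. This is only an illustration: the actual conversion procedure is the one given in Appendix A, and the sketch assumes the simplest proportional conversion (each rating divided by the incumbent's rating total). The function names and the example responses are hypothetical.

```python
def to_percent_time_spent(ratings):
    """Convert one incumbent's 1-5 ratings to percentages that sum to 100.

    `ratings` holds one entry per task; None marks a blank (no response).
    Assumption: each rating is divided by the incumbent's rating total.
    """
    total = sum(r for r in ratings if r is not None)
    if total == 0:
        return [0.0 for _ in ratings]
    return [100.0 * r / total if r is not None else 0.0 for r in ratings]


def to_task_performed(ratings):
    """Score 1 for any response on the scale and 0 for a blank."""
    return [0 if r is None else 1 for r in ratings]


responses = [5, None, 3, 2, None]  # hypothetical incumbent, five tasks
percents = to_percent_time_spent(responses)
performed = to_task_performed(responses)
```

Because the percentages are forced to sum to 100 over every task an incumbent marks, an incumbent who responds to several hundred tasks necessarily receives a very small percentage on each.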


¹In the Navy, the term "rating" indicates a basic enlisted occupation (e.g., ST--Sonar
Technician)  and  "service  rating"  identifies  a  major  class  of  equipment  or  systems  worked 
on  within  a  rating  (e.g.,  STS--Sonar  Technician  (Submarine)).  Navy  Enlisted  Classification 
Code  (NEC)  indicates  a  more  specialized  skill  within  or  across  ratings. 
Programs from the Comprehensive Occupational Data Analysis Programs (CODAP)
(Weissmuller, Barton, & Rogers, 1974) are applied to the scale data to derive the following
job  description  profiles: 

1.  MP— Percent  of  Members  Performing  (each  task). 

2.  TSM— Average  Percent  Time-Spent  by  All  Members. 

3.  TSMP— Average  Percent  Time-Spent  by  Members  Performing  (each  task). 

The  scales,  job  description  profiles,  and  the  CODAP  hierarchical  clustering  procedure 
used  to  group  persons  performing  similar  work  are  described  in  Appendix  A. 
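
The three profiles can be illustrated with a short sketch. It is not the CODAP code; it assumes only the definitions listed above, takes percentages that have already been converted, and uses a hypothetical four-person group.

```python
def job_description_profiles(percent_rows):
    """Return (MP, TSM, TSMP) lists, one value per task.

    `percent_rows` has one list per incumbent; each value is the percent of
    time spent on a task, with 0 meaning the task is not performed.
    """
    n_members = len(percent_rows)
    n_tasks = len(percent_rows[0])
    mp, tsm, tsmp = [], [], []
    for t in range(n_tasks):
        column = [row[t] for row in percent_rows]
        performers = [v for v in column if v > 0]
        mp.append(100.0 * len(performers) / n_members)   # percent performing
        tsm.append(sum(column) / n_members)               # average over all members
        tsmp.append(sum(performers) / len(performers) if performers else 0.0)
    return mp, tsm, tsmp


group = [  # hypothetical group: four incumbents, three tasks
    [50.0, 0.0, 50.0],
    [100.0, 0.0, 0.0],
    [25.0, 25.0, 50.0],
    [0.0, 0.0, 100.0],
]
mp, tsm, tsmp = job_description_profiles(group)
```

Note that each TSMP value rests only on the members who perform that task, so for rarely performed tasks it is an average over very few persons.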

Standard  formulae  to  determine  sample  size  requirements  for  collecting  survey  data 
have  been  available  for  some  time  (Cochran,  1953;  Parten,  1950).  These  procedures, 
however,  require  an  estimate  of  the  population  variance  (often  not  easily  estimated),  and 
sampling  assumptions  that  are  not  easily  met  by  operational  surveys.  Also,  they  are 
limited  in  that  they  are  not  appropriate  for  estimating  multivariate  population  parameters 
(e.g.,  scale  response  rates  for  more  than  one  task  in  an  inventory,  or  characteristics  of 
multivariate  cluster  solutions— however,  see  Frankel,  1971;  Moonan,  1954;  and  Wolfe, 
1970).  Because  of  these  requirements  and  the  limitation,  and  because  the  specific 
characteristics  or  properties  of  data  do  affect  the  results  of  analyses,  the  present  study 
analyzed  the  stability  of  samples  of  real  data. 
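
For orientation, the kind of standard formula referred to above can be sketched for the simplest case, the estimate of a single proportion with a finite population correction. This is the widely taught textbook form, shown as an illustration rather than as a procedure used in this study; the default proportion, margin of error, and z value are assumptions.

```python
import math


def cochran_sample_size(population, proportion=0.5, margin=0.05, z=1.96):
    """Sample size for estimating one proportion, with finite population correction."""
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


n_ad = cochran_sample_size(14296)  # AD population N from Table 2
n_tm = cochran_sample_size(2513)   # TM population N from Table 2
```

The calculation needs a value for the proportion (hence the variance) in advance and addresses one task at a time, which illustrates the requirements and the limitation noted above.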

Purpose 

The  purpose  was  to  determine  empirically  the  relationship  of  sample  size  to  stability 
of  incumbents'  inventory  scale  responses.  Questions  specifically  addressed  were: 

1.  What  is  the  stability  and  interrelationship  of  two  kinds  of  occupational  task 
inventory  scales— Relative  Time-Spent  and  Task-Performed? 

2.  What  is  the  stability  of  cluster  solutions  that  use,  as  input  data,  scale  responses 
by  individual  job  incumbents? 

3.  What  changes  occur  in  stability  indices  when  sample  sizes  are  reduced? 


METHOD 


Data 


Relative Time-Spent data for inventory tasks were provided by the Navy Occupational
Development and Analysis Center (NODAC). The data were from four ratings
representative  of  different  occupational  areas— Aviation  Machinist's  Mate  (AD), 
Electronics  Technician  (ET),  Torpedoman's  Mate  (TM),  and  Yeoman  (YN)  (see  Table  1). 
The  data  had  been  collected  from  a  variety  of  Fleet  and  Shore  activities,  although 
instructor  and  student  billets  were  not  sampled.  Each  of  the  four  rating  samples 
contained  data  from  eight  different  pay  grades,  E-2  through  E-9.  These  data  comprised 
the  populations  (referred  to  as  "Total  Sample")  from  which  samples  were  drawn  for 
analysis.  Task  data  for  entire  rating  populations  do  not  exist.  Thus,  findings  based  on 
NODAC  samples  provide  the  best  available  guidelines  for  sample  sizes  required  for  rating 
populations. Appendix B presents sample and population sizes by pay grade (Table B-1),
and  the  types  of  units  sampled  for  the  AD  and  ET  ratings  (Table  B-2). 
Table 1

Task Inventory Sizes for Four Navy Ratings

                                              Inventory Size(a)
Rating Title                 Abbreviation    Total Items    Tasks    Administration Date

Aviation Machinist's Mate    AD                  1163        404     August 1974
Electronics Technician       ET                  1080        597     June 1975
Torpedoman's Mate            TM                   782        337     March 1975
Yeoman                       YN                   810        529     August 1975

(a) The task item section (i.e., statements of work performed) was analyzed. Other
inventory sections include biographical, job satisfaction, and equipment items.


Sampling  Procedure 

A  set  of  samples  was  created  from  the  total  sample  for  these  ratings  (see  Table  2), 
using  a  systematic  random  sampling  procedure  described  by  Kish  (1965).  Sampling  was 
performed  within  each  pay  grade  of  each  rating  or  service  rating  to  assure  a  similar 
proportion within pay grade across samples (because of pay grade importance in determining
occupational requirements and in other management decisions). Pairs of independent
samples (i.e., no individual was included in both samples of the pair) were created by
randomly splitting one of the next larger samples, rather than sampling from two different
larger samples. For example, in Table 2, the two N = 250 independent AD
samples  were  both  drawn  from  one  of  the  N  =  500  samples.  Hereafter,  the  samples  will 
be  referred  to  by  the  rating  (or  service  rating)  abbreviation  and  sample  size  (e.g.,  the 
AD250  samples,  TM368  samples,  ETN504  samples).  A  and  B  denote  any  two  equal-sized, 
independent  rating  samples.  Table  2  also  shows  the  holdout  groups  that  were  drawn  for  a 
specific  analysis. 
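
The sampling scheme can be sketched as follows. The sketch assumes a systematic draw (fixed interval, random start) within each pay grade and a random split of each pay grade into two independent halves. The exact mechanics used in the study are not given here, so the function names, the seed, and the 500-person roster are hypothetical.

```python
import random


def systematic_sample(members, n, rng):
    """Take n members at a fixed interval, beginning at a random start."""
    if n == 0:
        return []
    interval = len(members) / n
    start = rng.random() * interval
    last = len(members) - 1
    return [members[min(int(start + i * interval), last)] for i in range(n)]


def group_by_pay_grade(members, pay_grade_of):
    groups = {}
    for member in members:
        groups.setdefault(pay_grade_of[member], []).append(member)
    return groups


def draw_within_pay_grades(members, pay_grade_of, fraction, rng):
    """Draw the same fraction from every pay grade to keep proportions similar."""
    drawn = []
    for _, people in sorted(group_by_pay_grade(members, pay_grade_of).items()):
        drawn.extend(systematic_sample(people, round(len(people) * fraction), rng))
    return drawn


def split_within_pay_grades(members, pay_grade_of, rng):
    """Randomly split a sample into independent halves A and B, grade by grade."""
    half_a, half_b = [], []
    for _, people in sorted(group_by_pay_grade(members, pay_grade_of).items()):
        shuffled = people[:]
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.extend(shuffled[:mid])
        half_b.extend(shuffled[mid:])
    return half_a, half_b


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
roster = list(range(500))                   # hypothetical incumbent IDs
pay_grade_of = {pid: "E-%d" % (2 + pid % 8) for pid in roster}
drawn = draw_within_pay_grades(roster, pay_grade_of, 0.5, rng)
sample_a, sample_b = split_within_pay_grades(drawn, pay_grade_of, rng)
```

Splitting one larger sample, rather than drawing twice, guarantees that no individual appears in both members of a pair.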

To  determine  the  stability  of  scores  on  the  Relative  Time-Spent  and  Task-Performed 
scales,  job  description  profiles  were  derived  by  pay  grade  for  the  following  pairs  (A,  B)  of 
samples: 


AD1269    ET1275    YN1386    TM368
AD500     ET500     YN500
AD250     ET250     YN250

Profiles  were  also  derived  for  four  pairs  of  service  rating  samples. 

Stability  indices  (described  below)  were  calculated  to  measure  the  similarity  of 
profiles across the A and B samples. Profiles were compared (across the pair) at the same
pay  grade  level  (e.g.,  E-4  in  both  A  and  B  samples)  as  well  as  at  different  pay  grade  levels 
(e.g., E-5 in one sample versus E-7 in the other paired sample). The similarity of principal
interest was the comparison of profiles at the same pay grade level; the higher this
similarity, the greater the stability of the average scale scores.


Table 2

Sample Sizes

N Samples                                  Rating
Drawn        N and %             AD       ET       YN       TM

             Population N     14296     9050     9847     2513

             Total Sample N    2538     2546     2771      735
             %P                17.9     28.1     28.1     29.2

2            N                 1269     1275     1386      368
             %A                50.0     50.0     50.0     50.0
             %P                 8.8     14.1     14.1     14.6

1            N                 2000     2000     2000
             %A                78.8     78.5     72.2
             %P                13.9     22.0     20.3

2            N                 1000     1000     1000
             %A                39.4     39.2     36.1
             %P                 6.9     11.0     10.1

2            N                  500      500      500
             %A                19.7     19.6     18.0
             %P                 3.4      5.5      5.0

2            N                  250      250      250
             %A                 9.8      9.8      9.0
             %P                 1.7      2.7      2.5

                                      Service Rating
                                ADJ      ETR
2            N                  976      366
             %A                38.5     14.4
             %P                13.6      8.1

                                ADR      ETN
2            N                  238      504
             %A                 9.3      9.0
             %P                 3.3      5.0

                                      Holdout Group
                                 AD       YN
             N                  540      774
             %A                21.3     27.9
             %P                 3.8      7.7

Scale  Stability  Indices 

The  following  indices  were  calculated  on  the  job  description  profiles  across  the  A  and 
B  samples  at  each  pay  grade  level. 

1. To measure the stability of relative values (essentially, the rank order) of profile
scores for tasks, the Product Moment correlation coefficient was calculated (see
Footnote 2) and labeled as:

a. rMP, when calculated on the Percent of Members Performing (MP) profile.

b. rTSM, when calculated on the Average Percent Time-Spent by All Members
(TSM) profile.

c. rTSMP, when calculated on the Average Percent Time-Spent by only those
Members Performing (TSMP) profile.

2.  To  measure  the  stability  of  the  absolute  (i.e.,  actual  percentage)  scores  for 
tasks,  the  difference  in  percentages  of  members  performing  tasks  was  calculated  and 
labeled  as: 


a. Z-Difference, when indicating the proportion of inventory tasks not obtaining
a significant percentage difference (p > .05, by Z-test, Walker & Lev, 1969, p. 188)
(e.g.,  a  proportion  of  .90  indicates  that  9/10ths  of  the  tasks  in  an  inventory  were  not 
significantly  different). 

b.  Percent  Difference,  when  indicating  the  proportion  of  inventory  tasks  that 
did  not  differ  by  more  than  05,  10,  15,  and  20  percentage  points  (i.e.,  as  with  Z- 
Difference,  a  large  proportion  equals  high  stability). 
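
The indices above can be sketched as follows. The correlation drops tasks scored zero in both profiles, as described in Footnote 2. The Z-Difference sketch assumes the usual pooled two-proportion test at the two-tailed .05 level, which may differ in detail from the test in Walker and Lev (1969). The function names and the example profiles are hypothetical.

```python
import math


def profile_correlation(profile_a, profile_b):
    """Pearson r across tasks, dropping tasks scored zero in both profiles."""
    pairs = [(a, b) for a, b in zip(profile_a, profile_b) if a != 0 or b != 0]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


def percent_difference(mp_a, mp_b, points):
    """Proportion of tasks whose MP values differ by no more than `points`."""
    within = sum(1 for a, b in zip(mp_a, mp_b) if abs(a - b) <= points)
    return within / len(mp_a)


def z_difference(mp_a, mp_b, n_a, n_b, critical=1.96):
    """Proportion of tasks with no significant MP difference (assumed pooled Z-test)."""
    not_significant = 0
    for a, b in zip(mp_a, mp_b):
        p_a, p_b = a / 100.0, b / 100.0
        pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = 0.0 if se == 0 else (p_a - p_b) / se
        if abs(z) <= critical:
            not_significant += 1
    return not_significant / len(mp_a)


mp_a = [80, 40, 0, 10, 55]  # hypothetical MP profile, sample A (percent)
mp_b = [70, 45, 0, 30, 50]  # hypothetical MP profile, sample B (percent)
r_mp = profile_correlation(mp_a, mp_b)
within_5 = percent_difference(mp_a, mp_b, 5)
within_10 = percent_difference(mp_a, mp_b, 10)
z_stable = z_difference(mp_a, mp_b, n_a=100, n_b=100)
```

With very small pay grade samples the standard error grows, so fewer differences reach significance and the Z-Difference proportion rises even when the percentages themselves differ widely.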

Graphic  Relationship  Between  Pay  Grade  Sample  Size  and  Stability 

Because  the  scores  on  the  MP  profile  proved  to  be  highly  stable  (see  RESULTS,  p.  8) 
and  apparently  more  meaningful  (see  p.  10)  than  scores  on  the  Average  Percent  Time 
Spent  profiles,  the  plots  to  be  described  were  constructed  only  for  the  MP  profile  data. 
Further,  while  the  Z-Difference  values  (described  above)  were  calculated,  emphasizing 
their  use  might  be  misleading  since  lack  of  significance  for  percentage  differences  based 
on  small  ns  could  lead  to  an  erroneous  conclusion  of  adequate  stability.  Thus,  only  the 
values for the rMP and the Percent Difference stability indices were plotted against the
pay grade sample sizes contained in the following pairs of samples (i.e., A and B samples)
listed in Table 2: AD1269, ET1275, YN1386, and TM368. Only the rMP and Percent
Difference values that were calculated for corresponding pay grades (e.g., E-3 for sample
A compared to E-3 for sample B) were plotted. The plotted stability values (proportions
or rs) can range from zero (no stability) to 1.0 (maximum stability).


²The calculation treated tasks of job description profiles as cases, and task
percentages as scores. Pairs of zero scores on corresponding tasks of two profiles were
deleted from the calculation. With this correlational model, complete independence of
scores did not exist; that is, the same individuals provided responses for calculation of a
percentage (i.e., score) for more than one task. Cragun and McCormick (1967), however,
report only minor inflation for correlation coefficients derived with this same model for
the study of U.S. Air Force task analysis inventory reliability.


A  computerized,  cubic-spline,  curve-smoothing  procedure  was  applied  to  the  plotted 
data  points.  This  procedure  was  deemed  to  be  more  appropriate  than  curve  smoothing  (or 
fitting)  by  means  of  linear  regression  because  of  the  curvilinear  (asymptotic)  nature 
observed  in  the  data.  The  spline  curve  procedure  generates  the  smoothest  possible  curve 
that  passes,  on  the  average,  within  a  specified  distance  of  the  data  points  (ISSC,  1973,  pp. 
11-7  to  11-9). 


Since  the  relationship  between  the  Percent  Difference  values  and  sample  size  of  pay 
grade  appeared  to  be  curvilinear,  the  eta  coefficient,  as  opposed  to  the  linear  correlation 
coefficient,  was  calculated  between  these  two  variables  (see  formulae  in  Dunnette,  1966, 
pp.  146-148). 
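
The eta coefficient (correlation ratio) can be sketched as follows. The formulae in Dunnette (1966) are not reproduced here; the sketch assumes the common definition, the square root of the between-group sum of squares over the total sum of squares, with sample sizes grouped into intervals. The interval boundaries and data points are hypothetical.

```python
import math


def eta_coefficient(x_values, y_values, bin_edges):
    """Correlation ratio of y on x, with x grouped into intervals."""
    groups = {}
    for x, y in zip(x_values, y_values):
        index = sum(1 for edge in bin_edges if x >= edge)  # interval containing x
        groups.setdefault(index, []).append(y)
    grand_mean = sum(y_values) / len(y_values)
    ss_total = sum((y - grand_mean) ** 2 for y in y_values)
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    return math.sqrt(ss_between / ss_total)


sizes = [10, 15, 40, 45, 150, 160, 300, 320]                  # pay grade sample sizes
stability = [0.50, 0.54, 0.80, 0.82, 0.90, 0.92, 0.93, 0.95]  # hypothetical index values
eta = eta_coefficient(sizes, stability, bin_edges=[30, 100, 200])
```

Unlike the linear correlation coefficient, eta makes no assumption about the shape of the relationship, so it is not depressed when the stability values level off at larger sample sizes.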

Relationship  Between  Task-Performed  and  Relative  Time-Spent  Scales 

Preliminary observation indicated little informational difference between the Percent
of Members Performing (MP) profile (derived from the Task-Performed scores) and
the  Average  Percent  Time-Spent  by  All  Members  (TSM)  profile  (derived  from  the  Relative 
Time-Spent  scores).  Thus,  if  the  information  obtained  from  the  two  scales  is  highly 
redundant,  the  time  demands  on  the  job  incumbent  could  be  reduced  by  administering  a 
two-point  Task-Performed  scale.  To  confirm  empirically  this  preliminary  observation, 
correlations  between  these  profiles  were  calculated  (using  the  same  model  described  in 
Footnote 2) within each of the eight pay grades, E-2 through E-9, using one of the AD1269
and  one  of  the  TM368  samples. 


Procedures  to  Determine  Cluster  Solution  Stability 

Employing  the  CODAP  (IBM  360  version)  hierarchical  cluster  procedure  (see  Appendix 
A),  24  separate  cluster  analyses  were  performed  on  the  following  samples: 


AD2000            ET2000            YN2000            TM735
AD1000 (A & B)    ET1000 (A & B)    YN1000 (A & B)    TM368 (A & B)
AD500 (A & B)     ET500 (A & B)     YN500 (A & B)
AD250 (A & B)     ET250 (A & B)     YN250 (A & B)


These  analyses  resulted  in  24  hierarchical  cluster  solutions.  Since  2000  is  the  maximum 
number  of  cases  that  can  be  cluster-analyzed  by  the  IBM  version,  these  sized  samples  (and 
the  TM735  sample)  are  the  "total  samples"  for  this  part  of  the  method. 

Selection  of  Clusters 


Since  a  hierarchical  cluster  solution  consists  of  a  set  of  overlapping  clusters  (i.e., 
smaller clusters are contained in larger clusters), criteria to select the sets of nonoverlapping
(mutually exclusive) clusters on which to evaluate stability had to be specified.
These  criteria  were: 


1. Cluster size--At least one percent of the sample and as large as possible while
still meeting the following criteria.

2. Mutually exclusive cluster membership--No individual in more than one selected
cluster.

3. Cluster homogeneity (by CODAP-generated homogeneity index, Overlap Between)--At
least 35 percent. (A second CODAP-generated homogeneity index, Overlap
Within, was not part of the selection criteria, since it is strongly influenced by the
membership N of the cluster.)

Cluster  Matching  Procedure 

Selected clusters were systematically matched across the A and B pairs of independent
samples from which they were derived. The rationale and a detailed
description  of  the  matching  procedure  are  provided  in  Appendix  C.  In  general,  the 
procedure  matched  the  two  clusters  that  were  most  similar  to  the  same  cluster  derived 
from  the  rating  total  sample  (i.e.,  from  the  AD2000,  ET2000,  YN2000,  and  TM735  samples 
in  Table  2).  Thus,  any  two  matched  clusters  were  counterparts  of  a  single  cluster  from 
the total sample. The term "matched" refers only to clusters determined to be related
across  the  independent  A  and  B  samples.  The  term  "corresponding"  associates  any  total 
sample  cluster  with  its  A  and  B  counterparts. 

Cluster  Stability  Indices 

Job  description  profiles  were  calculated  for  each  of  the  selected  clusters. 
Comparisons  between  cluster  profiles  were  made  by  using  the  three  indices  of  profile 
similarity described below. The first two indices are essentially "cluster-to-cluster"
profile  comparisons.  That  is,  the  indices  are  calculated  between  two  clusters  on  the  same 
type  of  profile  (i.e.,  MP,  TSM,  or  TSMP  profile).  High  profile  similarity  between 
corresponding  clusters  indicates  high  stability.  By  contrast,  the  third  index  compared  the 
profile  data  of  individuals  from  an  independent,  holdout  sample  to  the  profiles  calculated 
for  the  selected  clusters.  This  index  was  used  in  an  assignment  procedure  to  determine  if 
the  same  individuals  would  be  assigned  to  each  of  the  clusters  in  a  matched  pair.  The 
same  individuals  will  be  assigned  to  each  of  the  clusters  if  the  matched  clusters  are 
stable. 


1. Product-moment correlation coefficient. This index was calculated on the three
types of job description profiles (MP, TSM, TSMP), between the clusters matched across
the A and B pairs of samples. Thus, each rAB indicated the stability of one of the three
types of job description profiles for a pair of matched clusters derived from the A and B
samples of 1000, 500, 368, or 250. The index is labeled as either MP rAB, TSM rAB, or
TSMP rAB, depending on the profile being compared. The average of obtained index
values for the set of matched clusters derived from one pair of A and B samples was also
calculated. (See Footnote 2 for the correlational model applied.)

The correlation coefficient was also calculated between the MP profile of the
total (T) sample clusters and the MP profile for the clusters matched across the A and B
pair of samples. Thus, each rTA and rTB indicated the stability of the MP profile of a
cluster from one of the 1000, 500, 368, and 250 paired samples when compared with the
profile of a corresponding total sample cluster. The average of rTA and rTB values was
also calculated for all clusters compared for each reduced sample.

2.  Number  of  tasks  performed.  This  tally  was  the  number  of  tasks  in  the  MP  profile 
with  a  percentage  value  greater  than  0  for  one  or  both  matched  clusters  being  compared. 
A  decrease  in  this  tally,  as  sample  size  is  reduced,  would  indicate  a  loss  of  task 
information  (i.e.,  more  tasks  were  obtaining  zero  scores  for  both  matched  clusters).  The 
average  of  these  tallies  (labeled  Av.  N  Tasks)  was  calculated  for  the  set  of  matched 
clusters  for  each  of  the  1000,  500,  368,  and  250  sample  pairs. 


3. Percent of Common Membership in Matched Clusters. This index was calculated
on the MP and TSM profiles. The procedures for calculating this index are described in
more  detail  in  Appendix  C.  Comparing  each  individual's  profile  with  each  cluster  profile, 
the  individual  was  "assigned"  to  the  cluster  with  the  best  fit--first  to  the  cluster  from  the 
A  sample,  then  to  the  best  fit  cluster  from  the  B  sample.  Next,  the  individual's 
assignment  to  both  of  the  matched  clusters  (across  the  A  and  B  samples)  was  checked,  and 
a  Percentage  of  Common  Membership  was  calculated  on  the  common  vs.  total  member¬ 
ship  for  the  two  matched  clusters.  (Because  of  the  extensive  calculations  required  to 
compare  hundreds  of  individual  profiles  with  each  of  several  clusters,  this  part  of  the 
analysis was performed only on the A and B AD1000 and YN1000 samples.) The total or
overall  Percentage  of  Common  Membership  was  also  calculated  for  each  set  of  AD  and 
YN  matched  clusters  (i.e.,  the  percentage  was  calculated  on  the  common  vs.  total 
membership  that  was  summed  over  all  matched  clusters  in  each  set). 
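
The assignment procedure behind this index can be sketched as follows. The actual fit measure and calculation are those described in Appendix C. The sketch assumes that fit is the smallest sum of absolute differences between an individual's profile and a cluster profile, and that "total membership" means all individuals assigned to either cluster of a matched pair. The names and data are hypothetical.

```python
def best_fit_cluster(individual, cluster_profiles):
    """Return the key of the cluster whose profile is closest to the individual.

    Assumption: fit is the smallest sum of absolute differences across tasks.
    """
    def distance(profile):
        return sum(abs(i - p) for i, p in zip(individual, profile))
    return min(cluster_profiles, key=lambda key: distance(cluster_profiles[key]))


def percent_common_membership(holdout, clusters_a, clusters_b, matches):
    """Percent of common membership for each matched (A key, B key) pair."""
    assigned_a = {pid: best_fit_cluster(p, clusters_a) for pid, p in holdout.items()}
    assigned_b = {pid: best_fit_cluster(p, clusters_b) for pid, p in holdout.items()}
    result = {}
    for key_a, key_b in matches:
        members_a = {pid for pid, k in assigned_a.items() if k == key_a}
        members_b = {pid for pid, k in assigned_b.items() if k == key_b}
        total = members_a | members_b      # assigned to either matched cluster
        common = members_a & members_b     # assigned to both matched clusters
        result[(key_a, key_b)] = 100.0 * len(common) / len(total) if total else 0.0
    return result


clusters_a = {"A1": [100, 0, 0], "A2": [0, 100, 100]}   # MP profiles, sample A
clusters_b = {"B1": [100, 0, 0], "B2": [60, 100, 40]}   # MP profiles, sample B
holdout = {1: [100, 0, 0], 2: [100, 100, 0], 3: [0, 100, 100], 4: [0, 100, 0]}
common = percent_common_membership(
    holdout, clusters_a, clusters_b, matches=[("A1", "B1"), ("A2", "B2")]
)
```

If two matched clusters are stable counterparts of one another, the same holdout individuals fall into both and the percentage approaches 100.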

Relationship  Between  Cluster  Membership  Size  and  Stability 


While the above analyses determined the effect of the size of the total sample on
deriving a set of stable clusters, this next analysis related the stability of single clusters
to the number of members within each cluster. The value of the stability index, rAB, for
the MP profile (see "1" above) was plotted against the average membership N for the two
matched clusters (of the A and B samples on which the rAB was calculated). These plots
were constructed for matched clusters derived from the pairs of the AD1000, ET1000,
YN1000, and TM368 samples. The curve-smoothing procedure (see page 5) was also
applied to these data. It should be noted that the MP rAB index, as well as the other
correlational indices described above, measures the stability of the relative values of the
percentages on cluster profiles.


RESULTS 


Comparison  Among  Scales 


Stability  of  Scales 

As  shown  in  Table  3,  the  stability  of  the  Average  Percent  Time-Spent  by  All  Members 
(TSM)  profile  and  the  Percent  of  Members  Performing  (MP)  profile  was  found  to  be  very 
high;  and  that  of  the  Average  Percent  Time-Spent  by  (only)  Members  Performing  (TSMP) 
profile,  relatively  low.  For  example,  the  mean  correlation  coefficients  for  pay  grades  E-2 
through  E-9  were  mostly  .90  or  above--never  below  .80--for  all  ratings  for  both  the  TSM 
and  MP  profiles,  while  those  for  the  TSMP  profiles  were  .28,  A3,  .36,  and  .28  for  the  four 
ratings.  Furthermore,  for  12  of  32  comparisons  (i.e.,  eight  pay  grades  for  four  ratings)  at 
the same pay grade level (e.g., E-3 for the A and B samples), the rTSMP coefficient was
not  even  significantly  different  from  zero,  indicating  no  similarity  or  stability  between 
profiles. 

Examination of the rTSMP coefficients revealed that 22 of the 32 comparisons at the
same pay grade level (i.e., for the E-2 through E-9 comparisons for the AD1269, ET1275,
YN1386, and TM368 samples) yielded lower values than some different level pay grade
comparisons (e.g., E-3 with E-5). On the other hand, rMP and rTSM values were generally
much higher for the same level than for different level pay grades, and systematically
decreased as pay grade disparity increased (see Appendix D, Table D-1, for intercorrelations
for the AD1269 samples). Results for three other ratings are available on request
from the Navy Personnel Research and Development Center, Code 310.

Similar correlational results were obtained for the four service ratings (i.e., for the A
and B samples ADJ976, ADR238, ETR366, and ETN504 listed in Table 2). That is, very
high r_MP and r_TSM values (e.g., for comparisons at the same pay grade level for ETR and
ETN, respectively, mean r_MP = .89 and .94), and very low r_TSMP values (e.g., ETR and
ETN mean r_TSMP = .27 and .22) were obtained. (Service rating results are available on
request.)

Relationship  Between  Scales 

Little  informational  difference  (i.e.,  little  independence)  was  found  between  the 
relative  values  of  the  MP  and  TSM  profiles.  As  Table  4  shows,  correlations  between  these 
profiles  were  in  the  mid  .90s  (except  for  pay  grade  E-9). 


Table 4

Correlation Between Profiles of the Percent of Members
Performing (MP) and the Average Percent Time-Spent by
All Members (TSM)

Rating                               Pay Grade
Sample          E-2   E-3   E-4   E-5   E-6   E-7   E-8   E-9

AD1269    r     .94   .96   .97   .97   .93   .96   .96   .72
          N      67   149   282   337   281   108    31    14

TM368     r     .92   .94   .96   .96   .94   .90   .92   .83
          N       8    29    66   125    92    36    10     2

N is the number of persons within pay grade sample.


Meaningfulness  of  Average  Scale  Scores 

The  magnitudes  of  the  TSM  percentages  (i.e.,  average  scores  on  the  Relative  Time- 
Spent  Percentage  Scale)  for  the  pay  grades  of  all  ratings  analyzed  were  generally  found  to 
be  substantially  below  1  percent  and  very  often  below  .1  percent.  This  finding  suggests 
that  all  members  in  the  pay  grade  spend,  on  the  average,  much  less  than  1  percent  of  their 
time  performing  any  single  task. 

Appendix E (pages E-1 through E-8) contains the average scale scores (i.e., percentages)
for portions of the three job description profiles for YN pay grade E-5 and for TM
pay grade E-7. The displayed scores were ordered by the TSM scores, from the highest
value in the entire profile to the lowest (although the lowest value is not shown due to
space limitation). The percent Time-Spent value is above 1 percent for only 18 of 337
tasks in the TM task inventory (page E-1), and less than .1 percent for more than 100 of
the 337 TM inventory tasks. Very small values (i.e., about 1 percent) are also typically
obtained for the TSMP profile. Tasks performed for only minute fractions of the job
incumbent's time tend to yield information of little use for decisions regarding the
structuring or staffing of billets. On the other hand, values for the MP profile (see pages
E-1 through E-8) appear meaningful and useful. For example, the values displayed for TM
E-7 (Table E-1) range from about 2 percent to about 82 percent, with substantial
percentages of personnel performing many of the inventory tasks.

Stability  for  Varying  Sample  Sizes 

The  expected  relationship  was  found  between  all  of  the  stability  indices  and  pay 
grade sample size (see Appendix F, Tables F-1 through F-4). As pay grade sample size
increased, the stability increased (e.g., as sample size for the YN pay grades increases
from 10 to 340, the obtained r_MP value increases from .74 to .99; see Table F-4).

Figures  1  and  2  display  the  plots  between  stability  and  sample  size.  (The  derivation 
of  the  plot  axes  and  the  curve  smoothing  procedure  are  described  on  pages  5  and  6). 
Curve 1 in Figure 1, which plots sample size against the 10 Percent Difference index,
indicates that extremely high stability was attained when sample size within pay grade
reached about 100; and high stability, when the size reached about 30. Curve 2, which
plots the more rigorous 5 Percent Difference values (see page 5), indicates that very high
stability was attained when sample size reached about 240; and moderate stability, when
the size reached about 100. Generally, the improvement in stability begins to drop rapidly
for increases in pay grade size above 40 in Curve 1, and for increases above 140 in Curve 2.
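
The Percent Difference indices plotted in Figure 1 can be sketched as follows. The exact definition appears on page 5 of the report and is not reproduced in this section, so the sketch relies on the Table 5 column headings: the proportion of inventory tasks whose Percent of Members Performing values in the A and B samples fall within 5 (or 10) percentage points of each other. Treating "within" as "less than or equal to" is an assumption, and the function name and example values are hypothetical.

```python
def percent_difference_index(mp_a, mp_b, tolerance):
    """Proportion of tasks whose MP values in samples A and B differ by
    no more than `tolerance` percentage points (assumed definition)."""
    if len(mp_a) != len(mp_b) or not mp_a:
        raise ValueError("profiles must be non-empty and of equal length")
    within = sum(1 for a, b in zip(mp_a, mp_b) if abs(a - b) <= tolerance)
    return within / len(mp_a)


# Hypothetical MP profiles (percent of members performing four tasks)
mp_sample_a = [82, 40, 15, 3]
mp_sample_b = [78, 52, 9, 2]

index_5 = percent_difference_index(mp_sample_a, mp_sample_b, 5)    # 0.5
index_10 = percent_difference_index(mp_sample_a, mp_sample_b, 10)  # 0.75
```

Because the 5 percent tolerance is the stricter of the two, its index can never exceed the 10 percent index for the same pair of samples, which is consistent with Curve 2 lying below Curve 1.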


The curve in Figure 2, which plots the r_MP values, is highly similar to Curve 1 in
Figure 1. Both display high stability when sample size exceeds about 30 and extremely
high stability for samples above 100. Also, both curves are clearly asymptotic and show
minimal improvement in stability for increases above 40.

If we compare the curves in Figure 1 for a sample of 100, we find that an increase of
50 percent, to 150, would raise stability in Curve 2, which plots by the more rigorous
criterion, from .75 to .83, but that it would produce hardly any gain by Curve 1, already
at .97. If we compare Figures 1 and 2 for a sample of 80, we find that Curve 2 in Figure 1
indicates a stability index of only .70, but the curve in Figure 2, an index of .95. Table 5,
which presents corresponding points on all of the curves for selected sample sizes,
indicates that sampling above size 240 would produce very little gain, even in terms of the
most rigorous stability criterion. Further, if only the rank order of percentages of
members performing tasks is required, a sample size of 100 or even 40 would be
acceptable (r_MP = .97 or .90).
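
The "pay-off" reasoning in this paragraph amounts to tracking the gain in stability per added incumbent. The sketch below applies that idea to the 5 Percent Difference values given in Table 5. The function name is hypothetical, and expressing the gain per 100 added incumbents is a choice made here for readability, not a convention of the report.

```python
# Values taken from Table 5 (5 Percent Difference criterion)
SAMPLE_SIZES = [40, 100, 240, 340, 440]
STABILITY_5PCT = [0.58, 0.75, 0.91, 0.96, 0.99]


def gain_per_100(sizes, stability):
    """Gain in the stability index per 100 incumbents added, computed
    between successive sample sizes."""
    gains = []
    for i in range(1, len(sizes)):
        added = sizes[i] - sizes[i - 1]
        gain = stability[i] - stability[i - 1]
        gains.append(round(100.0 * gain / added, 3))
    return gains


gains = gain_per_100(SAMPLE_SIZES, STABILITY_5PCT)
# approximately [0.283, 0.114, 0.05, 0.03]
```

The gain per 100 added incumbents shrinks at every step, from about .28 between sizes 40 and 100 to about .03 between sizes 340 and 440, which is the diminishing return the text describes for samples above 240.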

The eta coefficients (η), calculated between sample size and each of the stability
indices of Figure 1 (see Dunnette, 1966), were quite high--η = .76 for Curve 1 and .88 for
Curve 2 (p < .01, df = 5, 26)--indicating a significant consistency for pay grades of
different occupational areas.3
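
For readers unfamiliar with eta, the following sketch computes the correlation ratio from grouped observations and the usual F ratio for testing it. It uses the textbook definition (eta squared equals the between-group sum of squares divided by the total sum of squares); whether this is identical in form to the Hays formula cited in footnote 3 has not been checked. The function names are hypothetical. The degrees of freedom reported (5, 26) imply six sample-size intervals and 32 observations.

```python
def eta(groups):
    """Correlation ratio: sqrt(SS_between / SS_total) for observations
    grouped by interval of the independent variable."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return (ss_between / ss_total) ** 0.5


def f_ratio_for_eta(eta_value, n_groups, n_obs):
    """F ratio with (n_groups - 1, n_obs - n_groups) degrees of freedom."""
    eta_sq = eta_value ** 2
    return (eta_sq / (n_groups - 1)) / ((1.0 - eta_sq) / (n_obs - n_groups))


# Reported coefficients, six intervals, 32 pay grade samples (df = 5, 26)
f_curve_1 = f_ratio_for_eta(0.76, 6, 32)  # about 7.1
f_curve_2 = f_ratio_for_eta(0.88, 6, 32)  # about 17.8
```

Under this formula, both F ratios are large for 5 and 26 degrees of freedom, which is consistent with the reported significance at the .01 level.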

Stability  of  Clusters 

As shown in Table 6, the number of clusters selected from each of the 24 obtained
solutions ranged from 10 to 16 for the largest samples, and from 13 to 17 clusters in the
smallest samples. Also, for all ratings except YN, the percentage of personnel from each
sample who were included within selected clusters by rating is similar (e.g., for the AD
rating, the percentages ranged from 66.6 for the sample of 1000, to 74.4 for the sample of
250, compared with 38.2 (N = 1000) to 55.8 (N = 250) for YN). Except for TM, very similar
numbers of clusters were selected for solutions based on the total sample (i.e., the
AD2000, ET2000, and YN2000 samples) compared with the numbers of clusters selected
from the size 1000 samples.


3 For the calculation of both coefficients, six intervals were constructed for the
independent variable (i.e., sample size), thus assuring at least three observations per
interval (see Lewis, 1960, pp. 120-122). For significance test of eta, see Hays, 1963,
Formula 16.6.4.


Table 5

Comparison of Three Stability Indices for
Selected Sample Sizes

                                   Stability Index

                    Proportions of Inventory          Rank Order of
                    Tasks with Percent of             Tasks by Values
                    Members Performing Within:        of Percent of
                                                      Members Performing
Sample Size         5% Diff.         10% Diff.
Within Pay Grade    (more rigorous   (less rigorous
                    criterion)       criterion)       r_MP

       40              .58              .87              .90
      100              .75              .97              .97
      240              .91             1.0               .99
      340              .96             1.0               .99
      440              .99             1.0               .99

Table 6

Number of Clusters Selected from Total Cluster Samples
and Pairs of Reduced Samples

                                                   Rating
                                  AD            ET            YN            TM
Cluster Solution                A     B       A     B       A     B       A     B

N of Selected Clusters(a)         16            13            10            12
Sample N                         2000          1996          1998           735
% Sample N in Clusters           68.4          73.8          39.0          73.2

N of Selected Clusters         17    15      16    13       9    11      11    17
Sample N                      999  1000    1000   996     999   998     368   367
% Sample N in Clusters       72.9  66.6    77.1  74.6    38.2  43.8    73.6  78.7

N of Selected Clusters         20    18      14    18       8    13
Sample N                      500   499     500   499     499   500
% Sample N in Clusters       74.2  69.3    74.6  80.4    42.3  42.8

N of Selected Clusters         17    16      13    14      16    16
Sample N                      250   250     250   250     250   249
% Sample N in Clusters       71.6  74.4    79.2  82.4    52.0  55.8

(a) N of Selected Clusters refers to only those clusters selected by criteria on page 6.


The  matching  procedure  (described  in  Appendix  C)  produced  a  set  of  matched  clusters 
for  each  A  and  B  pair  of  independent  samples  of  1000,  500,  and  250,  as  well  as  for  the  pair 
of  TM368  samples. 


Cluster  Stability  by  Type  of  Scale 


When comparing all three job description profiles across AD1000 matched clusters,
stability was found to be very high for the MP and TSM profiles (see Table 7), but very low
for the TSMP profile. As Table 7 shows, the mean values for the MP r_AB, TSM r_AB, and
TSMP r_AB coefficients were .89, .90, and .17. These results, across clusters, are highly
similar to the results across pay grades already reported (on page 8). Because of these
results, and the finding that MP and TSM profiles were highly correlated (see Table 4), it
was decided to evaluate cluster stability only on the basis of the MP profile.


Cluster  Stability  by  Sample  Size 

Correlational Results. The high MP r_AB average values obtained for samples of 1000
(see Table 8 and the analytical design described in Appendix C) indicate the following
relationships: (1) high stability for clusters derived from independent samples of 1000, (2)
high stability for clusters from total samples of 2000, since highly similar clusters (which
were counterparts of total sample clusters) were found in both half samples of 1000, and

Table 7

Similarity of Job Description Profiles Across
Matched Clusters for the AD1000 Paired Samples

      Matched Clusters
  Cluster ID# and membership          Stability Index r_AB
        From Sample
          A       B               MP        TSM       TSMP

  #       1       4              .98        .98        .08
  N     103      67

  #       5       2              .92        .94        .15
  N      34      39

  #       2       5              .80        .81        .21
  N      92      32

  #       7       6              .73        .70        .04
  N      44      29

  #       3       1              .96        .96        .06
  N      71      97

  #       6       3              .96        .96        .26
  N      45      51

  #       8       8              .93        .94        .22
  N      20      69

  #      10       9              .99        .99        .20
  N     109     115

  #      13      11              .93        .95        .15
  N      21      20

  #      12      13              .94        .92        .20
  N      24      36

  #      14       7              .59        .64        .06
  N      28      12

  #      16      14              .82        .85        .07
  N      22      11

  #      15      12              .93        .94        .41
  N      31      47

  #      17      15              .95        .96        .32
  N      39      29

  Mean =                         .89        .90        .17

Note. Data presented are for those matched clusters selected only by criterion on p. 6.

Table 8

Average Stability Values of Members Performing (MP) Profile for Matched
Clusters from Reduced Samples

                                                         Sample Size
Rating  Index(a)                             1000         500         368         250

AD      Av. r_AB (for Matched Clusters)       .89         .82                     .73
          Range                             .59-.98     .56-.98                 .37-.94
        Av. N Tasks                           281         240                     240
        Av. r_AB (All Clusters)               .25         .16                     .12
        Av. r_TA and r_TB                   .95 .93     .93 .89                 .88 .88
        N Cluster Pairs (1st Search)           14          15                      12
        N Cluster Pairs (2nd Search)            2           1                       4

ET      Av. r_AB (for Matched Clusters)       .87         .82                     .78
          Range                             .69-.98     .62-.96                 .48-.96
        Av. N Tasks                           415         428                     395
        Av. r_AB (All Clusters)               .48         .39                     .32
        Av. r_TA and r_TB                   .90 .94     .86 .91                 .89 .87
        N Cluster Pairs (1st Search)           11          10                      10
        N Cluster Pairs (2nd Search)            2           3                       3

YN      Av. r_AB (for Matched Clusters)       .89         .80                     .62
          Range                             .75-.97     .47-.96                 .30-.96
        Av. N Tasks                           375         338                     291
        Av. r_AB (All Clusters)               .50         .40                     .22
        Av. r_TA and r_TB                   .93 .94     .84 .89                 .78 .81
        N Cluster Pairs (1st Search)            8           7                       9
        N Cluster Pairs (2nd Search)            2           3                       1

TM      Av. r_AB (for Matched Clusters)                               .80
          Range                                                     .50-.97
        Av. N Tasks                                                   224
        Av. r_AB (All Clusters)                                       .24
        Av. r_TA and r_TB                                           .92 .90
        N Cluster Pairs (1st Search)                                   10
        N Cluster Pairs (2nd Search)                                    2

Notes.

1. To evaluate the relative magnitude of the Av. r_AB for Matched Clusters, the r_AB index was also calculated between each selected
cluster of sample A with each selected cluster of sample B. The average of these values is displayed as Av. r_AB (All Clusters).

2. All displayed indices were calculated only for clusters selected by 1st search criteria (see page 6).

3. N of Cluster Pairs is the number of pairs of matched clusters selected by the 1st and 2nd search criteria (see 2nd search criteria on
pages C-2 and C-3).

(a) For calculation of Av. r_AB, Av. N Tasks, and Av. r_TB, see page 7.

(3) minimal differences between clusters from samples of 1000 and total sample clusters,
since the 1000 size clusters were counterparts of the total sample clusters.


By the MP r_AB index, cluster stability declined as sample size was reduced, and
dropped noticeably from sample sizes of 1000 to 250 (see Av. r_AB for Matched Clusters in
Table 8). For example, the average MP r_AB index for the YN1000, 500, and 250 samples
dropped from .89 to .80 to .62. The difference in MP r_AB average values between clusters
from samples 1000 and 250 (ranging from about 9 to 27 correlation points) is substantial,
considering that the smaller samples are also contained in the larger samples.

Similar trends may be observed for the MP r_TA and MP r_TB indices (see Av. r_TA and
r_TB in Table 8). For example, the Av. r_TA values for the AD1000, 500, and 250 samples
decrease from .95 to .93 to .88. The dependence between each reduced sample (A or B)
and the total sample, however, appears to maintain these values higher than average MP
r_AB values. (The MP r_TA and r_TB values for each pair of matched clusters derived
from samples of 1000 are displayed in Appendix G, Tables G-1 through G-3.)


Number  of  Tasks  Performed.  A  substantial  loss  of  task-performed  information  (i.e.,  a 
drop  in  the  number  of  tasks  performed)  occurred  for  matched  clusters  from  samples  of 
250,  compared  with  samples  of  1000.  For  example,  the  number  of  tasks  performed  (see 
the index, Av. N Tasks, in Table 8) for the AD1000 and AD250 samples dropped from 281
to  240,  and  for  the  YN1000  and  250  samples,  from  375  to  291. 


Percent of Common Membership. With some exceptions, the clusters evaluated by
this index were found to be moderately to highly stable, thus supporting further the
stability demonstrated by the correlation indices above. Tables G-1 and G-2 display
values for the Percent of Common Membership calculated on the MP and TSM profiles for
the AD1000 and YN1000 samples. For the AD matched clusters, the total index value
calculated on the MP profile was 75.7 percent, and on the TSM profile, 81.1 percent (see
Table G-1). Some of the values are low, however, especially for the YN samples (see
Table G-2). The sizes of these percentages appear to have been lowered due to error of
individual data (as distinguished from average profile data in the clusters) and due to the
dependence among matched clusters within each sample (see Appendix C, page C-4 for
further explanation). For the correlation indices, no corresponding decrease occurred
because, for those indices, average profile data were used, and each pair of matched
clusters was analyzed separately. It should also be noted that the values of the Percent of
Common Membership for the MP and TSM profiles were highly similar (see Tables G-1 and
G-2).


Cluster  Stability  by  Membership  Size 

Figure 3 demonstrates a substantial drop in stability for the MP r_AB index when
cluster membership (the number of incumbents within a cluster) was less than about 20.4


4 Carpenter (1974) reported high stability for Task-Performed Data for clusters with
membership greater than 10. The coefficients were calculated between overlapping
clusters (i.e., stability was determined by comparing smaller clusters to larger clusters
that contained the smaller clusters). Thus, the values would be overestimates.


Figure 3. Stability of clusters by membership size. (Horizontal axis: average membership N of cluster.)


DISCUSSION 


Issues  Pertaining  to  Properties  of  Inventory  Scales 

Effect  of  Zero  Scores  on  Relative  Time-Spent  Scale 

High  stability  was  generally  demonstrated  for  the  TSM  (Average  Percent  Time-Spent 
by  all  Members)  profile,  and  low  stability  for  the  TSMP  profile  (Average  Percent  Time- 
Spent  by  only  Members  Performing)  (see  Table  3).  While  both  profiles  were  calculated 
from  responses  to  the  Relative  Time-Spent  Scale,  only  the  calculation  of  the  Average 
Percent  (i.e.,  Average  Relative  Time)  scores  for  the  TSM  profile  included  zero  scores  for 
those  incumbents  in  the  sample  who  did  not  perform  a  task.  The  inclusion  of  zeros  results 
in  a  substantial  drop  in  the  standard  deviation  between  task  scores  (as  observed  in  the 
difference  between  the  TSMP  and  TSM  standard  deviation  values  in  Table  3),  and  an 
apparent  tendency  for  all  TSM  profiles  (i.e.,  score  distributions)  to  be  positively  skewed. 
A  consistent  shape  in  score  distributions  is  reflected  in  the  high  correlation  coefficients 
obtained  between  TSM  profiles. 

Validity  of  Scale  Responses 

It  is  noted  parenthetically  that  the  present  study  did  not  include  a  validation  of  either 
of  the  two  scales  (Task-Performed  or  Relative  Time-Spent  scales)  on  any  external  criteria 
(e.g.,  Subject  Matter  Expert  judgments).  Conclusions  of  other  studies  regarding  the 
validity of the Relative Time-Spent scale responses have not been consistent (Hartley,
Brecht, Pagery, Weeks, Chapanis, & Hoecker, 1977, vs. Carpenter, Giorgia, & McFarland,
1975; McCormick, 1976). Using instructors' daily recordings of the time spent on tasks as
the criterion, Carpenter et al. (1975) reported findings that indicate that responses by
U.S. Air Force trainees on the Relative Time-Spent scale were highly valid, regardless of
the number of scale steps (e.g., 5 vs. 9 steps). As evidence, they reported that the
difference between the Relative Time-Spent profile for trainees and instructor estimates,
when  converted  to  percentages,  averaged  about  1  percentage  point  on  each  task.  There  is 
a  serious  limitation  to  that  kind  of  validation  study,  however,  when  comparing  the 
CODAP-generated  TSMP  profile  to  a  criterion  that  also  consists  of  percentage  of  time 
spent.  That  is,  if  the  number  of  tasks  responded  to  on  both  profiles  exceeds  100,  then  it  is 
likely  that  most  percentages  being  compared  will  be  very  small,  often  about  1  percent  or 
less.  Furthermore,  such  small  percentages  will  result  for  any  profile,  regardless  of  the  set 
of  profile  tasks.  In  the  Carpenter  et  al.  study,  since  the  subjects  were  in  a  basic  training 
program,  it  would  be  highly  likely  that  both  trainee  and  instructor  would  respond  to  most 
of  the  130  training  tasks,  thereby  increasing  the  likelihood  that  most  Time-Spent 
percentages  being  compared  would  be  very  small,  often  below  1  percent.  Thus,  an  error 
(or  percent  difference)  averaging  about  1  percent  per  task  between  the  two  profiles  could 
be  a  relatively  large  error  that  might  yield  very  low  correlation  values,  if  the  relative 
order  of  the  Time-Spent  on  tasks  was  analyzed  (as  was  performed  in  the  present  study). 
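
The arithmetic behind this limitation can be made explicit. The sketch below assumes that each respondent's Relative Time-Spent percentages sum to 100 across the tasks marked, so that the mean percentage per task is simply 100 divided by the number of tasks marked. The function names are hypothetical.

```python
def mean_percent_per_task(n_tasks_marked):
    """Mean percent time per task when a respondent's percentages sum
    to 100 across the tasks marked (an assumption of this sketch)."""
    if n_tasks_marked <= 0:
        raise ValueError("n_tasks_marked must be positive")
    return 100.0 / n_tasks_marked


def error_relative_to_mean(abs_error_points, n_tasks_marked):
    """Average absolute error expressed as a multiple of the mean
    percent time per task."""
    return abs_error_points / mean_percent_per_task(n_tasks_marked)


mean_130 = mean_percent_per_task(130)       # about 0.77 percent per task
ratio_130 = error_relative_to_mean(1, 130)  # about 1.3
```

With 130 tasks marked, the mean percentage per task is about .77, so an average error of 1 percentage point is roughly 1.3 times the size of the typical value being estimated.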

Hartley  et  al.  (1977)  compared  job  incumbent  estimates  of  time  spent  on  23  work 
activities  with  on-site,  recorded  observations  of  the  actual  Time-Spent.  They  found  an 
Average  Time-Spent  difference  of  about  24  percent  and  concluded  that  the  accuracy  of 
incumbent  estimates  is  "suggestive  at  best,"  and  that  on-site  observation  may  be  more 
appropriate.  (The  small  sample  of  12  office  workers,  however,  raises  a  question  as  to  the 
stability  of  the  average  error  obtained.) 

It  is  noted  that  the  Time-Spent  values  in  the  Hartley  et  al.  study  were  based  on 
workers' estimates of absolute time spent (hours or minutes, or percentages of a specific
time period), which were then converted to relative time for the total observation period.
By contrast, Carpenter et al. reported that absolute Time-Spent values converted to
percentages  (by  a  slightly  different  procedure  from  Hartley  et  al.)  were  as  accurate  (by 
the  instructor  criterion)  as  the  other  Relative  Time-Spent  Percentage  estimates.  The 
validity  of  Relative  Time-Spent  estimates  by  incumbents  appears  to  be  questionable. 
Hartley  et  al.,  however,  did  report  that  incumbents  can  accurately  rank-order  tasks  in 
terms  of  time  spent.  Also,  they  demonstrated  that  incumbents  were  very  accurate  in 
identifying  the  tasks  that  they  performed,  thus  providing  valid,  task-performed  data. 

Minimal  Information  Gain  from  Relative  Time-Spent  Responses 

It is reasonable to expect a finding of high similarity in the rank-order of tasks for
TSM and MP profiles (results in Tables 3 and 7). (A similar finding is reported in
Carpenter, 1970.) As the percent of members performing a task increases, the value of
the average percent of time spent by all members on that task will be based on fewer zero
scores, and thus also increase. These results indicate that the use of either profile in
correlational (or order) type analyses will yield very similar results.
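
The dependence of the TSM profile on the MP profile follows from the way the averages are formed: for any task, the average over all members equals the proportion of members performing multiplied by the average over members performing. The sketch below checks this identity on a small hypothetical example; the function name and the scores are illustrative only.

```python
def task_profile_values(scores):
    """Return (MP, TSM, TSMP) for one task.

    scores: percent time reported by each incumbent, 0 = not performed.
    """
    n_members = len(scores)
    performers = [s for s in scores if s > 0]
    mp = 100.0 * len(performers) / n_members   # percent performing
    tsm = sum(scores) / n_members              # average over all members
    tsmp = sum(performers) / len(performers) if performers else 0.0
    return mp, tsm, tsmp


# Four hypothetical incumbents, two of whom perform the task
mp, tsm, tsmp = task_profile_values([0.0, 0.0, 2.0, 4.0])

# TSM = (MP / 100) x TSMP: here 1.5 = 0.50 x 3.0
identity_holds = abs(tsm - (mp / 100.0) * tsmp) < 1e-12
```

Because TSMP values vary little across tasks in absolute terms, the MP factor dominates the product, which is one way to see why the two profiles rank tasks so similarly.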

Disadvantages  of  the  Relative  Time-Spent  Responses 

For  all  ratings  analyzed,  extremely  small  values  were  obtained  for  the  TSM  profile 
scores — often  less  than  one-half  of  a  percent  (see  examples  in  Appendix  E).  This  result 
makes  meaningful  interpretation  of  Relative  Time-Spent  per  task  data  difficult.  In 
informal discussions, Navy managers who use task information reported little use of the
Relative  Time-Spent  scores. 

Cragun  and  McCormick  (1967)  reported  two  other  disadvantages  of  a  Time-Spent 
scale.  First,  military  officer  job  incumbents  evaluated  a  Time-Spent  scale  less  favorably 
than  other  standard  response  scales  (e.g.,  importance-to-job  scale).  Second,  Cragun  and 
McCormick  estimated  that  the  job  incumbents  were  able  to  mark  only  three  or  four  tasks 
per  minute  on  the  Time-Spent  scale.  Using  a  three  per  minute  estimate,  it  would  take 
enlisted  personnel  approximately  2.5  hours  to  mark  only  450  tasks  out  of  the  800  to  1000 
items in a standard inventory. (Cragun & McCormick also reported a test-retest
correlation  of  about  .60  for  responses  to  a  9-point  Time-Spent  scale.) 

Substantial savings in inventory administration time, with little or no loss of useful
information,  would  be  realized  if  personnel  samples  marked  only  a  Task-Performed  scale 
and  not  also  the  Relative  Time-Spent  scale.  Further,  Task-Performed  responses  could 
routinely  be  derived  from  marks  vs.  no  marks  on  another  scale  that  is  already  a  standard 
part  of  NOTAP  inventories,  the  Involvement  scale.  (This  scale  is  a  4-point  scale 
indicating  the  type  of  job  involvement— supervising,  doing,  supervising  and  doing,  or 
assisting— with  each  task.) 
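
Deriving Task-Performed responses from the Involvement scale, as suggested above, requires only a test for the presence of a mark. In the sketch below, the numeric coding of the four involvement types and the use of None for an unmarked task are assumptions made for illustration; the report does not specify the coding.

```python
# Hypothetical coding of the 4-point Involvement scale
INVOLVEMENT_CODES = {
    1: "supervising",
    2: "doing",
    3: "supervising and doing",
    4: "assisting",
}


def task_performed(involvement_marks):
    """Convert one incumbent's Involvement marks to dichotomous
    Task-Performed scores: 1 for any mark, 0 for no mark (None)."""
    return [0 if mark is None else 1 for mark in involvement_marks]


def percent_members_performing(all_marks):
    """MP profile computed from the Involvement marks of all incumbents."""
    performed = [task_performed(marks) for marks in all_marks]
    n_members = len(performed)
    n_tasks = len(performed[0])
    return [
        100.0 * sum(p[t] for p in performed) / n_members
        for t in range(n_tasks)
    ]
```

For example, two incumbents with marks [2, None, 4] and [None, None, 1] yield an MP profile of 50, 0, and 100 percent for the three tasks.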

Use  of  Alternative  Scales  to  Derive  Clusters 


While the CODAP cluster analysis procedure operated on individual Relative
Time-Spent scores, results (in Table 7) indicate that it produced clusters that are stable by the
MP profile (derived from Task-Performed scores), and the closely related TSM profile, but
not by the TSMP profile. This result suggests that the procedure may be essentially
driven by Task-Performed data, not by the Relative Time-Spent data. Indeed, as
illustrated in Appendix A, the Overlap Between values, which are the similarity index
values for the clustering procedure, are more closely related to the TSM than to the TSMP
profile. The data have clearly demonstrated the close relationship between the TSM
profile and the Task-Performed responses (i.e., the Percent of Members Performing
profile). CODAP options include a capability for clustering on Task-Performed responses,
thereby obviating reliance on Relative Time-Spent scores. (Another on-going study is
comparing cluster solutions based on Task-Performed vs. Relative Time-Spent scores.
The obtained similar values of the Percent of Common Membership index calculated on
the MP profile (i.e., Task-Performed scores) and the TSM profile (i.e., Relative
Time-Spent scores) as reported on page 18, suggest that little difference between such solutions
will be found.)

In  addition,  continuous  scale  information  for  tasks  performed  by  each  incumbent 
could  be  derived  more  economically  and  perhaps  more  reliably  by  small  samples  of  subject 
matter  experts.  These  data  could  then  be  cluster  analyzed  by  the  CODAP  system  (see 
procedure  in  Pass  and  Robertson,  1979). 

Alternative  Cluster  Selection  Criteria 


Although  other  clustering  procedures  rely  on  external  judgments  regarding  additional 
data  (e.g.,  job  title,  specialty  code,  type  unit,  pay  grade),  the  objective  method  of 
selecting clusters in the present study did not. One criterion that was employed--using a
minimum of 35 on the Overlap Between index (Archer, 1966)--appears to be useful for
selecting stable clusters.

Utility  of  Findings 

Cost-Effective  Sampling  for  Inventory  Administration 

The  empirically  developed  relationships  (displayed  in  Figures  1  and  2)  demonstrate 
that  there  are  sample  size  ranges  beyond  which  stability  does  not  appreciably  increase 
(i.e.,  the  displayed  curves  are  sharply  asymptotic).  This  result  strongly  supports  a 
justification  to  establish  upper  limits  for  sample  size  when  collecting  Task-Performed 
data  (i.e.,  data  to  calculate  the  MP  profile).  It  should  be  emphasized  that,  in  general, 
sample  size  requirements  for  collecting  dichotomous  type  scale  data  will  be  more  than 
adequate  for  collecting  continuous  type  (e.g.,  five  point)  scale  data  (Bemis,  1978). 

For  purposes  such  as  identifying  the  inventory  tasks  that  are  performed  by  the  most 
personnel,  stable  estimates  of  only  the  relative  value  or  rank  order  (as  displayed  in  Figure 
2)  of  percentages  of  incumbents  performing  inventory  tasks  would  be  adequate.  If  stable 
estimates  of  the  actual  percentage  of  personnel  performing  tasks  are  required,  the 
relationships  displayed  in  Figure  1  can  be  applied  to  determine  an  adequate  sample  size. 
Further,  the  curves  in  Figures  1  and  2  can  be  used  interactively  to  satisfy  stability 
requirements  for  both  types  of  estimates  discussed  above.  Thus,  management  could 
specify  minimum  levels  of  stability  both  for  the  relative  order  of  the  percentages  of  MP 
tasks  and  for  the  absolute  percentage  of  members  performing  each  task.5 


5 Farrell,  Stone,  and  Yoder  (1976)  recommend  a  single  sample  size  of  about  400 
personnel  to  be  sampled  from  each  Marine  Corps  Occupational  Field.  Based  on  informal 
discussions with U.S. Air Force investigators, it appears that determination of minimal
sample sizes for inventory administration has not been performed. Christal (1974b)
suggests sampling as many incumbents in the population as possible to assure an adequate
sample size for deriving stable clusters and for analyzing all conceivable subgroups in the
population. 
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For  an  application  of  Figures  1  and  2  to  the  ET  rating  (see  Table  B-l),  sample  sizes 
could  be  determined  as  follows:  100  from  E-2,  240  each  from  E-3  through  E-6,  200  from 
E-7,  100  from  E-8,  and  40  from  E-9.  This  revised  total  sample  of  1389  reflects  a  45 
percent  reduction  compared  with  the  operational  sample  of  2546  for  which  data  were 
actually  collected.  Further,  these  revised  sample  sizes  would  improve  stability  within 
those  pay  grades  where  such  improvement  is  most  needed.  Similarly,  substantial 
reductions  in  total  sample  size  (i.e.,  a  reduction  of  about  1000),  while  increasing  overall 
stability,  could  be  achieved  for  AD  and  YN  ratings.  For  TM  (see  Table  B-l),  however,  a 
reasonable  application  of  the  data  in  Figure  1  would  indicate  a  requirement  to  increase 
sample  size  for  certain  pay  grades  as  follows:  100  for  E-2  and  E-3  combined,  240  for  E-4, 
30  for  E-8,  and  11  for  E-9  (see  Table  B-l  for  remaining  pay  grade  sample  sizes).  These 
pay  grade  increases  would  result  in  a  relatively  small  increase  of  151  personnel  for  the 
total  sample  (i.e.,  operational  total  of  735,  compared  with  the  revised  total  of  886).  For 
each  of  the  larger  rating  populations  samples  (i.e.,  populations  with  over  7000  personnel), 
records  indicate  sampling  about  1000  more  personnel  than  required.  Thus,  for  the  15 
larger  ratings,  a  reduction  of  15,000  personnel  for  inventory  administration  could  be 
realized.  Using  3.5  hours  as  an  estimate  of  time  to  administer  the  inventory,  52,500  work 
hours  (i.e.,  15,000  x  3.5  =  52,500)  could  be  saved  each  time  these  ratings  were  sampled. 
Alternatively,  additional  required  information  could  be  collected  from  the  smaller 
samples  while  still  decreasing  somewhat  the  total  work  hours  lost  to  the  operational  units. 
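
The TM and work-hour arithmetic above can be checked with a short sketch. The pay grade groupings follow Table B-1 (where E-2 through E-4 are combined for TM); the dictionary and variable names are illustrative assumptions, not part of the study.

```python
# Operational TM sample sizes by pay grade group, as listed in Table B-1.
tm_operational = {"E-2 to E-4": 205, "E-5": 251, "E-6": 183,
                  "E-7": 71, "E-8": 21, "E-9": 4}

# Revised sizes from the text: 100 for E-2/E-3 combined plus 240 for E-4,
# 30 for E-8, and 11 for E-9; the other pay grades are left unchanged.
tm_revised = {**tm_operational, "E-2 to E-4": 100 + 240, "E-8": 30, "E-9": 11}

operational_total = sum(tm_operational.values())
revised_total = sum(tm_revised.values())
increase = revised_total - operational_total

# Work hours saved if 15 larger ratings each drop about 1000 respondents
# and the inventory takes an estimated 3.5 hours to administer.
hours_saved = 15 * 1000 * 3.5

print(operational_total, revised_total, increase, hours_saved)
```

The totals reproduce the figures quoted in the text (735, 886, an increase of 151, and 52,500 work hours).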

The  utility  of  these  findings  relies  on  the  representativeness  of  the  Navy  units 
sampled  (e.g.,  see  Table  B-2  for  AD  and  ET  ratings).  It  is  reasonable  to  expect  the 
findings  to  apply  to  occupations  judged  to  be  as  homogeneous  as  (or  more  homogeneous 
than)  pay  grades  within  a  rating.  Although  the  study  demonstrates  sample  size 
requirements  for  occupational  specialties  defined  as  Navy  ratings,  the  methods  are 
deemed  to  be  similarly  applicable  to  other  levels  of  occupational  description  (e.g.,  a  Navy 
Enlisted  Classification  Code  (NEC)  or  a  Military  Occupational  Specialty  (MOS)  of  the 
other  services). 

An  extension  of  the  methods  developed  could  be  directed  towards  the  question  of 
when  it  is  necessary  to  administer  a  subsequent  inventory  to  the  same  rating  (the  present 
Navy  cycle  is  about  4  years).  Very  small  subsamples  could  be  evaluated  to  detect  changes 
over  time  in  tasks  performed,  until  some  critical  point  is  reached  for  which  a  full  sample 
size  is  required.  This  extended  application  has  implications  for  important  decisions 
regarding  when  to  revise  occupational  standards  or  training  curricula. 

Reduced  Computer  Processing  Costs 

The study demonstrated that appreciable drops in the stability of cluster solutions did
not occur until the total sample (i.e., the sample that included personnel from pay grades
E-2 through E-9) was reduced to 250 (see Table 8), substantially below the total of 2000
typically processed by the IBM 360 CODAP procedure. Thus, if total sample size were
reduced to 1000 with the above procedures, highly stable clusters could still be derived.
Further,  since  computer  processing  time  for  the  cluster  analysis  procedure  is  an 
exponential  function  of  sample  size,  and  since  the  processing  of  about  2000  cases  can 
exceed  7  hours  of  central  processing  unit  (CPU)  time  on  an  IBM  360/67  computer  and  3 
hours  on  a  UNIVAC  1108,  reducing  the  sample  by  one-half  will  substantially  reduce 
computer  time  and  costs. 

These findings apply only to clusters derived from heterogeneous samples of
individual responses (as distinguished from average response data) on about 400-600 tasks.


CONCLUSIONS


1. No practical gain in stable, meaningful task information is achieved from
enlisted job incumbent responses on the Relative Time-Spent scale, compared with the
Task-Performed scale. More informative and more efficiently collected estimates of the
time spent per task could probably be based on incumbents' ranking of the most
time-consuming tasks.

2.  Task-Performed  data;  that  is,  percentages  of  personnel  performing  tasks,  are 
highly  stable  for  samples  substantially  smaller  than  samples  previously  collected. 

3.  Substantial  data  acquisition  and  processing  costs  can  be  saved  by  using  the 
empirically-developed  relationships  to  determine  minimal  sample  sizes  that  optimize 
stability. 


RECOMMENDATIONS 


It  is  recommended  that: 

1.  The  Relative  Time-Spent  scale  be  deleted  from  future  task  inventories  to  reduce 
substantially  administration  time. 

2.  Alternative  methods  of  estimating  time  spent  performing  tasks,  including 
ranking  the  most  time-consuming  tasks,  be  used  on  a  trial  basis  in  task  inventory  surveys. 

3.  Responses  to  a  currently  administered  inventory  scale  (see  page  21)  be  used  to 
calculate  the  percentage  of  incumbents  performing  tasks. 

4. The study's empirically-developed guidelines be used as an aid to determine
minimal sample sizes required for stable job analysis information.


DESCRIPTION OF INVENTORY SCALES, CODAP JOB DESCRIPTION
PROFILES, AND CODAP CLUSTERING PROCEDURE


Scales 


1.  Relative  Time-Spent— A  five-point  Likert-type  scale  of  time  spent  performing  a 
task  relative  to  other  job  tasks,  with  scale  points  ranging  from  "very  much"  through 
"average"  to  "very  little."  (While  the  Navy's  task  analysis  program  employed  the  five- 
point  Time-Spent  scale,  other  military  services  use  a  seven-  or  nine-point  scale.) 

2.  Relative  Time-Spent  Percentage— This  is  not  a  true  "response"  scale;  rather,  it  is 
a  conversion  of  the  Relative  Time-Spent  scale  responses  to  percentages  that  sum  to  100 
percent  for  all  tasks  performed  by  one  individual.  A  simplified  illustration  for  five  tasks 
(versus  the  usual  400  to  600  tasks)  is  presented  below: 

          Relative                 Relative Time-
Task      Time-Spent Response      Spent Percentage

  1       1 (very little)                10
  2       3 (average)                    30
  3       1                              10
  4       1                              10
  5       4 (above average)              40
         --                             ----
Total    10                             100%

3. Task-Performed-- A dichotomous (or two-point) scale on which a "1" indicates
task performed; and a "0", task not performed. A job incumbent's mark versus no mark on
some point of the Relative Time-Spent scale converts, respectively, to scores of 1 or 0 on
the Task-Performed scale.
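
The two conversions described above can be sketched as follows, using the five-task illustration from the text. The function names are hypothetical, and coding an unmarked task as 0 is an assumption made for the sketch.

```python
def time_spent_percentages(responses):
    """Convert Relative Time-Spent responses (0 = no mark) to percentages
    that sum to 100 across the tasks one incumbent performs."""
    total = sum(responses)
    if total == 0:
        return [0.0 for _ in responses]
    return [100 * r / total for r in responses]


def task_performed(responses):
    """Score 1 if any Relative Time-Spent point was marked, else 0."""
    return [1 if r > 0 else 0 for r in responses]


# Five-task illustration from the text: responses 1, 3, 1, 1, 4 (sum = 10).
responses = [1, 3, 1, 1, 4]
percentages = time_spent_percentages(responses)
performed = task_performed(responses)
print(percentages, performed)
```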

Job  Description  Profiles 

1. MP-- Percent of Members Performing (the task)-- the percentage of scores of "1"
on the Task-Performed scale for each inventory task for a particular sample or cluster of
individuals (the term cluster refers to a mathematically derived group of incumbents
who perform similar work tasks).

2. TSM-- Average Percent Time-Spent by All Members-- the average of Relative
Time-Spent percentages across all incumbents in the sample or cluster for each task in
the inventory.

3.  TSMP— Average  Percent  Time-Spent  by  Members  Performing  (the  task)— the 
average  of  Relative  Time-Spent  percentages  across  only  those  respondents  in  the  sample 
or  cluster  actually  performing  each  task  (as  indicated  by  a  response  on  one  of  the  Relative 
Time-Spent  scale  points). 
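
As a minimal sketch of the three profiles, the function below computes MP, TSM, and TSMP from a list of incumbents' Relative Time-Spent percentages. The function name and the convention that a percentage of 0 means "task not performed" are assumptions; the two data rows are the incumbent A and B values used in the similarity-index example later in this appendix.

```python
def job_description_profiles(percent_rows):
    """Return (MP, TSM, TSMP) lists, one value per task.

    percent_rows holds one list of Relative Time-Spent percentages per
    incumbent; a value of 0 is taken to mean the task was not performed.
    """
    n = len(percent_rows)
    mp, tsm, tsmp = [], [], []
    for task_values in zip(*percent_rows):
        performers = [v for v in task_values if v > 0]
        mp.append(100 * len(performers) / n)   # percent of members performing
        tsm.append(sum(task_values) / n)       # average over all members
        tsmp.append(sum(performers) / len(performers) if performers else 0.0)
    return mp, tsm, tsmp


incumbent_a = [10, 10, 70, 10, 0]
incumbent_b = [50, 10, 10, 20, 10]
mp, tsm, tsmp = job_description_profiles([incumbent_a, incumbent_b])
print(mp, tsm, tsmp)
```

For task 5, which only incumbent B performs, TSM is 5 while TSMP is 10, which is the distinction the definitions draw.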

Clustering  Similarity  Index 

1a. Overlap Between-- Individuals. The sum of the smaller of the two percentages in
the comparison of two incumbents' Relative Time-Spent percentages on tasks. Example:

      Incumbent            Percent
     A         B           Overlap

    10       100             10%
    90         0              0%

              Overlap Between = 10%

1b. Overlap Between-- Clusters. The average of the Overlap Between values for
each individual in one cluster with each of the individuals in the other cluster. The
Overlap Between Index is the similarity measure used by the CODAP clustering procedure
(described below). It should be emphasized that the values of this index do not reflect
mean or level differences between Relative Time-Spent percentages for tasks as much as
would values based on a distance measure (see Cronbach & Gleser, 1953) or the values of
the TSMP profile and the TSM profile. The difference in the information contained in
these measures is illustrated by the following example:


         Relative Time-Spent                                Average Percent
            by Incumbent         Percent      Task            Time-Spent
Task        A         B          Overlap     Distance       TSMP      TSM

  1        10        50            10           40           30        30
  2        10        10            10            0           10        10
  3        70        10            10           60           40        40
  4        10        20            10           10           15        15
  5         0        10             0           10           10         5

                    Overlap Between = 40     Total Distance = 120

The varying differences between the A and B Relative Time-Spent values are not
reflected in the Percent Overlap or Overlap Between values as they are in the distance
and Average Percent Time-Spent values. Furthermore, previous research (Hamer, 1976)
on the comparability of similarity indices indicates very high comparability (r = .90, df =
48) between Overlap Between values and Pearson correlation coefficients used to measure
similarity between jobs. It should also be emphasized that the TSM values will almost
always be more closely related than TSMP values to the Overlap Between values (i.e.,
summed percent overlap values). That is, a zero percent overlap value for a task will
correspond to a TSM value that is closer to zero than the TSMP value.
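
The contrast between the two measures in the example table can be reproduced with a brief sketch. The function names are illustrative; the distance used here is the sum of absolute task differences, which matches the Task Distance column of the example.

```python
def overlap_between(pct_a, pct_b):
    """Sum over tasks of the smaller of two Relative Time-Spent percentages."""
    return sum(min(a, b) for a, b in zip(pct_a, pct_b))


def total_distance(pct_a, pct_b):
    """Sum over tasks of the absolute difference between the percentages."""
    return sum(abs(a - b) for a, b in zip(pct_a, pct_b))


incumbent_a = [10, 10, 70, 10, 0]
incumbent_b = [50, 10, 10, 20, 10]
print(overlap_between(incumbent_a, incumbent_b))  # Overlap Between
print(total_distance(incumbent_a, incumbent_b))   # Total Distance
```

Tasks 1 through 4 each contribute 10 to the overlap even though the task distances range from 0 to 60, which is why the overlap index is insensitive to level differences.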

Clustering  Procedure 

The  CODAP  clustering  program  is  based  on  the  Ward  hierarchical  cluster  analysis 
procedure  (Ward,  1961;  Christal  &  Ward,  1967).  The  procedural  steps  are  outlined  below. 

1.  Calculate  Overlap  Between  values  for  all  possible  pairs  of  job  incumbents  (see 
sample  matrix  below). 


            Overlap Between Matrix

Incumbent      A       B       C       D

    A        100      10      30      50
    B         10     100      40      70
    C         30      40     100      60
    D         50      70      60     100

2.  Combine  (cluster)  the  two  incumbents  with  the  highest  Overlap  Between  value 
(in  the  above  matrix,  incumbents  B  and  D  would  be  clustered). 

3. Continue to combine individuals and/or clusters by highest (average, if clusters)
Overlap Between percentages, for a number of stages equal to N-1 incumbents, until all
incumbents have been clustered into one total group. This agglomerative procedure
results in a hierarchical solution; that is, the smaller clusters are subsumed by larger
clusters.

4. For each cluster derived, calculate an Overlap Within index value as an indicator
of cluster homogeneity. This index is the average of Overlap Between values, including
redundant and diagonal values, for individuals contained in a cluster. Given the above
sample Overlap Between matrix, the Overlap Within for a cluster containing individuals C
and D would equal (100 + 60 + 60 + 100) / 4 = 80 percent. It should be noted, however, that
the inclusion of diagonal values in the calculation of the Overlap Within index will cause
those index values always to be higher, and, at times (depending on the size of cluster
membership and Overlap Between values), substantially higher, than the Overlap Between
values. This instability of the Overlap Within index is illustrated by clusters obtaining
very similar Overlap Between values but very different Overlap Within values, as
displayed in typical output of CODAP's (OVLGRP) program. Thus, the Overlap Between
index, and not the Overlap Within index, was used as an indicator of homogeneity for
selecting clusters for stability evaluation.
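
The four procedural steps can be sketched on the sample Overlap Between matrix. This is a minimal illustration that reads step 3 as merging on the highest average Overlap Between value; it is not the CODAP program itself, and the function names are assumptions made for the sketch.

```python
from itertools import combinations


def average_overlap(overlap, group_a, group_b):
    """Average Overlap Between value over all member pairs of two groups."""
    pairs = [(i, j) for i in group_a for j in group_b]
    return sum(overlap[i][j] for i, j in pairs) / len(pairs)


def cluster(overlap, labels):
    """Merge the pair of groups with the highest average overlap until one
    group remains; returns the N-1 merges as (group, group, overlap)."""
    clusters = [(i,) for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: average_overlap(overlap, *pair))
        merges.append((tuple(labels[i] for i in a),
                       tuple(labels[i] for i in b),
                       average_overlap(overlap, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


def overlap_within(overlap, members):
    """Average of all matrix entries for the members, diagonal included."""
    return average_overlap(overlap, members, members)


labels = ["A", "B", "C", "D"]
overlap = [[100, 10, 30, 50],
           [10, 100, 40, 70],
           [30, 40, 100, 60],
           [50, 70, 60, 100]]

merges = cluster(overlap, labels)
for step in merges:
    print(step)
print(overlap_within(overlap, (2, 3)))  # cluster containing C and D
```

On this matrix the first merge is B with D (overlap 70), as in step 2, and the Overlap Within value for C and D is 80, as in step 4.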
Table B-1


Population and Total Sample Size by Pay Grade
for Selected Ratings


                   Population             Total Sample
Pay Grade         N       % Pop          N      % Sample

AD Rating^a
    2           1278       8.9          135        5.3
    3           1525      10.7          297       11.7
    4           3390      23.7          565       22.3
    5           3384      23.7          674       26.6
    6           2666      18.6          562       22.1
    7           1276       8.9          215        8.5
    8            430       3.0           62        2.4
    9            347       2.4           40        1.6
  Total        14296      99.9         2550      100.0

ET Rating^b
    2             --                     --
    3            874       9.6          208        8.2
    4           2492      27.5          748       29.4
    5           3001      33.2          797       31.3
    6           1653      18.3          506       19.9
    7            666       7.4          197        7.7
    8            237       2.6           66        2.6
    9            127       1.4           24        0.9
  Total         9050     100.0         2546      100.0

TM Rating
    2
    3
    4            982      39.1          205       27.9
    5            756      30.1          251       34.1
    6            521      20.7          183       24.9
    7            182       7.2           71        9.7
    8             61       2.4           21        2.9
    9             11       0.4            4        0.5
  Total         2513      99.9          735      100.0

YN Rating
    2             --        --           --         --
    3           1607      16.3          415       15.0
    4           2609      26.5          852       30.7
    5           2246      22.8          680       24.5
    6           1758      17.9          485       17.5
    7           1228      12.5          266        9.6
    8            303       3.1           53        1.9
    9             96       1.0           21        0.8
  Total         9847     100.1         2772      100.0

Notes. 


1.  Population  refers  to  number  of  personnel  (not  billets)  in  rating.  Total  samples  were 
provided  by  NODAC. 

2.  Sample  Ns  exclude  personnel  in  instructor  and  student  billets. 

3. Pay grade 1 (E-1) personnel are not sampled since they do not have a rating.

4. Available records of population sampled showed Ns combined for pay grades 2
and 3 for ET and YN ratings, and combined for pay grades 2 through 4 for TM rating.
Thus, the sample Ns for these pay grades are similarly combined.


aFor pay grade 9, only ADs (N = 28), not AMs (N = 12), were analyzed.

bNs exclude nuclear plant operators and supervisors. Total sample included late
processed data for 87 personnel.


Table B-2


Types of Activities Represented in the AD and ET
Rating Samples


                                                          Number of    Number of
Activity Type                                             Activities   Personnel

AD Rating Sample

Attack Aircraft Carrier (CVA)                                  1            4
FAU COMNAVAIRPAC                                               1            1
NAVAIREWORKFAC                                                 2            9
Naval Air Facility, Washington, DC                             1           26
Naval Air Reserve Units (NARU)                                 6           29
Naval Air Stations (NAS)                                      10          564
Naval Air Training Center (NATC)                               1           62
NAV Missile Center Point Mugu                                  1           35
Pacific Missile Range                                          1            5
COMNAVAIRPAC NALCO COMP                                        1            2
PATWING II                                                     1            2
LATWINGPAC                                                     1            1
Helicopter Combat Support Squadron (HC)                        8           76
Helicopter Anti-Submarine Squadron (HS)                        8           61
Helicopter Mine Countermeasures Squadron (HM)                  1           23
Helicopter Anti-Submarine Squadron, Light (HSL)                7           70
Reconnaissance Squadron (RVAH)                                 2           33
Attack Squadron (VA)                                          19          271
Land Based Weather Reconnaissance Squadron (VW)                1           26
Patrol Squadron (VP)                                          17          239
Fighter Squadron (VF)                                         16          181
Fleet Composite Squadron (VC)                                  6           58
Air Anti-Submarine Squadron (VS)                              10          128
Photographic Squadron (VFP)                                    2           17
Carrier Airborne Early Warning Squadron (VAW/RVAW)             6           89
Fleet Air Reconnaissance Squadron (VQ)                         1           13
Fleet Tactical Support Squadron (VR)                           4           82
Tactical Electronic Warfare Squadron (VAQ)                     5           62
Fleet Tactical Support Squadron (VRC)                          1            9
Aircraft Ferry Squadron (VRF)                                  2           10

Note.  Data  for  AD  rating  sample  from  Halnon,  T.  D.  and  Gongloff,  R.  P. 
Occupational  Analysis  of  the  Aviation  Machinist's  Mate  (AD)  and  Master  Chief  Aircraft 
Maintenanceman  (AFCM)  Ratings  (Tech.  Rep.  NOTAP  76-3).  Washington,  DC: 
Occupational  Task  Analysis  Program,  December  1975.  Data  for  ET  sample  from  NOTAP 
unpublished  report,  Occupational  Analyses  of  the  ET  ratings. 

aEighteen  cases  were  dropped  due  to  data  deficiencies;  total  analyzed:  2550. 

bSix  cases  were  dropped  due  to  data  deficiencies;  total  analyzed:  2459  plus  late 
processed  data  for  87  personnel  for  total  N  =  2546. 


Table B-2 (Continued)


                                                          Number of    Number of
Activity Type                                             Activities   Personnel

Air Test and Evaluation Squadron (VX)                          2           23
Antarctic Development Squadron (VXE)                           1           22
HELTRARON                                                      2           73
TRARON                                                         8          213
TRAWING                                                        1            1
NAVFITWEPSCOL                                                  1            8

Total                                                        158         2568a

ET Rating Sample

Auxiliary Ships (AD/AG/AGDE(AGFF)/AGDS/AGSS/
  ARS/AS/ASR/ATF/ATS/AVM)                                     34          403
Underway Replenishment Group (AE/AFS/AO/AOR)                   8           29
Cruisers (CG/CGN)                                             11          105
Aircraft Carriers (CVA/CVAN)                                   8          228
Destroyers (DD/DDG)                                           40          214
Escort Ships (DE(FF)/DEG(FFG))                                21           95
Amphibious Warfare Ships (LKA/LPA/LPD/LPH/
  LSD/LST)                                                    21          119
Mine Warfare Ships (MSC/MSO)                                   4           13
Patrol Ships (PG)                                              2            3
Submarines (SS/SSN)                                           25           97
Submarines-- Ballistic Missile (SSBN)                         26          120
Communications Stations                                       19          420
Naval Air Stations                                            15          267
Small Craft/Shore Duty Elements                                8          (33)
DATC                                                           3           33
Training Centers                                               7           34
Squadrons/Staffs/Commands                                      6           19
MOTU                                                           5           75
Naval Stations                                                 7           58
Miscellaneous                                                 19          100

Total                                                        289         2465b

RATIONALE AND PROCEDURES FOR DETERMINING CLUSTER
SOLUTION STABILITY


Rationale for Matching Clusters

To determine cluster solution stability, the study executed a design analogous to that
used or recommended for evaluating stability of factor analysis solutions (Aleamoni, 1973;
Armstrong & Soelberg, 1968; Harman, 1967; Tucker, 1951). Essentially, this study's design
consisted of two steps:

1.  Matching  clusters  (factors  are  operated  on  in  the  analogue)  across  independent 
solutions  on  the  basis  of  similarity  to  the  total  sample  solutions  (a  description  of  the 
matching  procedure  appears  in  the  next  section  of  this  appendix). 

2.  Determining  the  degree  of  similarity  between  total  sample  clusters  and  matched 
clusters  from  reduced  samples,  as  well  as  between  matched  clusters.  (The  similarity 
between  clusters  was  measured  by  the  indices  described  on  pages  7  and  8  in  the  text.) 

Measures  of  similarity  between  total  and  reduced  sample  clusters  will  yield  spuriously 
high  results,  since  reduced  sample  data  are  also  contained  in  the  total  sample  data  (i.e., 
samples  are  not  independent).  This  spuriousness,  however,  is  not  present  in  the  measure 
of  similarity  between  matched  clusters.  High  similarity  between  matched  clusters  for 
two  independent  samples  demonstrates  that  a  stable,  recurrent  pattern  (i.e.,  cluster 
solution)  exists  across  the  data  from  samples  as  well  as,  of  course,  in  the  combined- 
sample  data. 

Cluster  Matching  Procedure  (in  5  Steps) 

Step  1 


For each rating and each pair of independent samples, an intercorrelation matrix of
product moment coefficients (rs) was calculated. The selected clusters derived from one
of the total samples analyzed (i.e., AD2000, ET2000, YN2000, or TM735) marked the row
dimension of the matrix, and the selected clusters derived from the independent samples
marked the column dimension (see the criteria for selecting clusters on page 6, and the
sample matrix below).


                             Independent Sample Clusters

Total Sample             Sample A                      Sample B
Clusters           1      2      3      4        1      2      3      4

    1             90     85     60    (45)      40     80    (70)    35
    2             85    _90_    35     40      _90_    85     30     71
    3            _95_    70     55     40       72    _98_    60     75
    4             70     60    _90_    50       75     70     76    _97_

The correlations were performed on the Percent of Members Performing (MP) job
description profile between clusters. In the calculation of the coefficients, tasks were
treated as cases, and the percentages of members performing tasks were treated as
scores. Scores of zero on corresponding tasks for any two cluster profiles were deleted
from the calculation. With this correlational model, complete independence of scores did
not exist. That is, the same individuals provided responses for calculation of a percentage
(i.e., score) for more than one task. Cragun and McCormick (1967) report, however, only
minor inflation for coefficients derived with this same model. This correlational model is
identical to that used to derive r_TA and r_TB values for matched clusters. In fact, the
r_TA and r_TB values were generated by this matching procedure.
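
A minimal sketch of this correlational model follows: tasks are the cases, MP percentages are the scores, and tasks on which both cluster profiles score zero are dropped before the product moment coefficient is computed. The function name and the two example profiles are hypothetical, not data from the study.

```python
import math


def mp_profile_correlation(profile_x, profile_y):
    """Pearson r between two MP profiles, with tasks as cases.

    Tasks on which both profiles score zero are deleted first.
    """
    pairs = [(x, y) for x, y in zip(profile_x, profile_y)
             if not (x == 0 and y == 0)]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical MP profiles (percent of members performing each of 5 tasks).
total_cluster = [80, 60, 0, 20, 0]
sample_cluster = [70, 50, 0, 30, 10]
r = mp_profile_correlation(total_cluster, sample_cluster)
print(round(r, 2))
```

Only the third task is dropped here, since it is the only one scored zero in both profiles.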


Step  2 


A cluster in each independent sample was identified for matching if it obtained an r
that was both the largest r for a row (i.e., for a total sample cluster) and the largest r for
that cluster column. In the above matrix, three clusters from each sample meet this
criterion, as indicated by the underlined coefficients (with decimals omitted), corresponding
to Sample A clusters A1, A2, and A3, and Sample B clusters B1, B2, and B4. Clusters
from each sample with underlined coefficients in the same row were matched, as is the
case for clusters A1 and B2, A2 and B1, and A3 and B4. Thus, each cluster in each of
these pairs is a "counterpart" of the corresponding total sample cluster. The columns and
rows (i.e., clusters) that contained an underlined coefficient were deleted from the
respective Sample A or B matrix half, as can be illustrated by drawing lines through these
rows and columns for both samples. Thus, the only remaining entries in the above matrix
are the coefficients in parentheses, 45 and 70, under clusters A4 and B3 respectively.


Step  3 

Step 2 was reiterated for the remaining cluster entries in the matrix. A cluster was
not identified, however, as matching if it obtained an r which was more than 10
correlation points smaller (an arbitrary criterion) than the largest r for a row in the
complete matrix. Thus, in the above matrix, cluster B3 is identified for matching since it
obtained a coefficient of 70. Cluster A4, which obtained a coefficient of 45, is not
identified for matching since 45 is more than 10 points smaller than 90, the largest
coefficient in that row. This criterion was used to avoid matching clusters that were not
closely related to a total sample cluster.
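
Steps 2 and 3 can be sketched on the sample matrix as follows. The sketch treats each sample half separately, applies the mutual-largest rule first, and then repeats it on the remaining rows and columns with the 10-point criterion measured against the largest coefficient in that row of the same sample half (an interpretation consistent with the worked example). The function and variable names are illustrative assumptions.

```python
def match_clusters(r, tolerance=10):
    """Match total sample clusters (rows) to sample clusters (columns).

    Returns {row index: column index}. After the first pass, a match is
    kept only if its coefficient is within `tolerance` points of the
    largest coefficient in that row of the complete matrix half.
    """
    rows = list(range(len(r)))
    cols = list(range(len(r[0])))
    row_max = [max(row) for row in r]
    matches = {}
    first_pass = True
    while rows and cols:
        found = {}
        for i in rows:
            j = max(cols, key=lambda c: r[i][c])          # largest r in row
            if max(rows, key=lambda k: r[k][j]) != i:     # also largest in column?
                continue
            if first_pass or row_max[i] - r[i][j] <= tolerance:
                found[i] = j
        if not found:
            break
        matches.update(found)
        rows = [i for i in rows if i not in found]
        cols = [c for c in cols if c not in found.values()]
        first_pass = False
    return matches


# Sample matrix from the text (decimals omitted); rows are total sample clusters.
sample_a = [[90, 85, 60, 45], [85, 90, 35, 40], [95, 70, 55, 40], [70, 60, 90, 50]]
sample_b = [[40, 80, 70, 35], [90, 85, 30, 71], [72, 98, 60, 75], [75, 70, 76, 97]]

matches_a = match_clusters(sample_a)
matches_b = match_clusters(sample_b)
print({t + 1: f"A{c + 1}" for t, c in matches_a.items()})
print({t + 1: f"B{c + 1}" for t, c in matches_b.items()})
```

The result pairs A1 with B2, A2 with B1, and A3 with B4 through their shared total sample clusters, accepts B3 for total sample cluster 1, and leaves A4 unmatched, as described in the text.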


Step  4 

If  steps  2  and  3  did  not  result  in  a  unique  pair  of  matched  clusters  identified  for  a 
particular  total  sample  cluster,  an  independent  sample  cluster  was  allowed  to  be  matched 
a  second  time  if  both  of  the  following  criteria  were  met: 

1. The sample cluster obtained the largest (or within 10 points of the largest) row r
for a particular total sample cluster.

2. There was a large correlation between the MP profiles of the two total sample
clusters (demonstrated by an r equal to or greater than an arbitrarily selected value of
.80; the correlational model used to calculate this r was the same model used to calculate
the matrix coefficients as described in Step 1 above).


Step  5 

If steps 1 through 4 did not result in a pair of matched clusters for every total sample
cluster, then additional clusters were selected by a second search of the independent
sample cluster hierarchies according to the following criteria:

1. Substantial common membership with total sample clusters for which there was
no corresponding matched pair (determined by examining case IDs).

2. Overlap Between index value no lower than 35 percent.

These additionally selected clusters were thus matched by their correspondence to the
same total sample cluster. Pearson correlation coefficients were obtained between these
matched clusters for the AD1000, ET1000, YN1000, and TM368 paired samples with
corresponding total sample clusters according to the model described in Step 1; that is,
r_TA and r_TB values were calculated. Extensive programming requirements prohibited the
calculation of the r_TA and r_TB values for additionally selected clusters for all samples,
although such Additional Clusters (ACs) were identified for each sample when necessary.
Also, a count was made, for each sample, of the number of matched pairs of clusters that
consisted of one or two Additional Clusters (labeled as N of Cluster Pairs-- 2nd search).

Common  Membership  in  Matched  Clusters 

Rationale 


The  derivation  of  the  Percent  of  Common  Membership  index  was  based  on  a  design 
idea  by  Orr  (1960).  This  index  specifies  the  degree  to  which  the  same  personnel  from  a 
holdout  sample  were  assigned  to  each  cluster  in  a  matched  pair  of  clusters.  When  Percent 
of  Common  Membership  values  (i.e.,  percentages)  are  averaged  over  all  matched  clusters 
for  any  two  paired  samples  (i.e.,  sample  A  and  sample  B),  the  result  indicates  the  degree 
to  which  a  similar  pattern  or  cluster  solution  was  obtained  across  samples— the  higher  the 
average  percentage  value,  the  higher  the  cluster  solution  stability. 

Assignment  of  Holdout  Group  of  Individuals 

Any  set  of  matched  clusters  consists  of  a  set  of  sample  A  clusters  and  a  set  of  sample 
B  clusters  (see  the  Cluster  Matching  Procedure  section  of  this  appendix).  In  the 
derivation  of  the  Percent  of  Common  Membership  index,  individuals  from  a  holdout  group 
were  assigned,  separately,  to  sample  A  clusters  and  to  sample  B  clusters.  The  following 
two  methods  of  assignment  were  used,  each  based  on  a  different  measure  of  profile 
similarity: 

1. Percent of Common Membership-- Time-Spent Method. Assignment of
individuals was determined by the value of the sum of absolute difference (i.e., distance)
between percentages on corresponding tasks of the Average Percent Time-Spent by All
Members (TSM) profile for clusters with the individual's Relative Time-Spent percentages.
Assignment was made to the cluster with the smallest distance value.

2. Percent of Common Membership-- Task-Performed Method. Cluster assignment
was determined by the largest point-biserial correlation between the individual's
Task-Performed scores (i.e., 0 for task not performed, and 1 for task performed) and the
Percent of Members Performing (MP) profile for clusters. This correlational model
treated tasks as cases, and scores of zero on corresponding tasks were included.
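
The two assignment methods can be sketched as below. The point-biserial coefficient is computed as an ordinary Pearson r with one dichotomous variable, which is numerically equivalent. The cluster names, profiles, and individual scores are hypothetical illustrations, and the sketch does not guard against an individual whose Task-Performed scores have zero variance.

```python
import math


def assign_by_distance(individual_pct, cluster_tsm_profiles):
    """Time-Spent method: pick the cluster whose TSM profile has the smallest
    sum of absolute differences from the individual's percentages."""
    distances = {
        name: sum(abs(p - q) for p, q in zip(individual_pct, profile))
        for name, profile in cluster_tsm_profiles.items()
    }
    return min(distances, key=distances.get)


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def assign_by_correlation(task_performed, cluster_mp_profiles):
    """Task-Performed method: pick the cluster whose MP profile correlates
    most highly with the individual's 0/1 scores (all tasks included)."""
    return max(cluster_mp_profiles,
               key=lambda name: pearson_r(task_performed, cluster_mp_profiles[name]))


# Hypothetical cluster profiles over five tasks.
tsm_profiles = {"X": [40, 30, 20, 10, 0], "Y": [0, 10, 20, 30, 40]}
mp_profiles = {"X": [90, 80, 60, 20, 5], "Y": [5, 20, 60, 80, 90]}

by_time_spent = assign_by_distance([50, 30, 20, 0, 0], tsm_profiles)
by_task_performed = assign_by_correlation([1, 1, 1, 0, 0], mp_profiles)
print(by_time_spent, by_task_performed)
```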

Calculation  of  Index 


The Percent of Common Membership, based on either assignment method, equaled
twice the number of personnel assigned in common to each pair of matched clusters,
divided by the total number of personnel assigned to each pair, and multiplied by 100. For
example, the sample A and sample B clusters for one matched pair are assigned 65 and 85
personnel respectively, and 65 are assigned in common. Thus, the

    Percent of Common Membership = (2 x 65) / (65 + 85) = .866 x 100 = 86.6%.

A maximum stability value for this index occurs if both clusters are assigned only the
same personnel (e.g., if both clusters are assigned the same 65 personnel, then

    index = (2 x 65) / (65 + 65) = 1.0 x 100 = 100%).
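
The index calculation can be written as a one-line function. The name and argument order are assumptions made for the sketch; note that unrounded arithmetic gives about 86.7 for the first example, which the text reports as 86.6.

```python
def percent_common_membership(n_common, n_cluster_a, n_cluster_b):
    """Twice the number of holdout personnel assigned in common, divided by
    the total assigned to the matched pair, multiplied by 100."""
    return 100 * (2 * n_common) / (n_cluster_a + n_cluster_b)


example = percent_common_membership(65, 65, 85)
maximum = percent_common_membership(65, 65, 65)
print(round(example, 1), maximum)
```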

Finally, it should be noted that this index was subject to two sources of attenuation
that the correlational indices were not: error due to individual data (versus mean data)
being analyzed, and attenuation due to dependence between matching clusters within each
paired sample. In regard to the latter, the higher the dependence (i.e., correlation
between within-sample cluster profiles), the more probable it was that low index values
would be obtained. To illustrate this point, consider that one sample A cluster in a pair of
matched clusters is highly correlated with another sample A cluster. Therefore, holdout
personnel with similar job description profiles will tend to be split (in assignment) between
these two highly correlated sample A clusters, but assigned as a group to only one sample
B cluster. Thus, in this case, the percentage of common membership between the
matched clusters would be attenuated.


APPENDIX F

STABILITY RESULTS OF AVERAGE TASK-PERFORMED


APPENDIX G


CLUSTER  STABILITY  RESULTS  FOR  RATING  SAMPLE 
SIZE  OF  1000 
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,82 

9 1: 

86Z 

S  Tasks 

159 

|ft8 

1)2 

N  Members 

15 

22 

11 

62/67 

18/44 

Cluster  Id. 

15 

15 

12 

Index 

.99 

.97 

.91 

721 

bit 

N  Tasks 

178 

212 

217 

X  Members 

b4 

31 

47 

4? '58 

16/44 

Cluster  Id. 

1* 

17 

15 

Index 

.99 

.97 

.9*1 

91 1 

95Z 

M  Tasks 

291 

277 

2*5 

X  ^embers 

A5 

39 

79 

57/ 4  7 

56 '3« 

Index  Aver 

.95 

.97 

.85 

i*ver  X  Tasks 

2  72 

271 

7b8 

Tot  X  Members 

1  lb  7 

711 

hb4 

IT  ol  sample 

hH* 

7 1T 

66? 

[mrw»  Me*iher*bjp 

81* 

’6? 

l>»f  i-.i 1 1  h|. ii  i i<iih  ol  --TA,  -tit,  m,1  -AH  .ire  desrrlhed  ,m  pane  7 


Percent  ‘.ium*nn  Membership  value*  e«iu*l  tvlio  the  number  of  personnel  assigned  in  row«  m  each  cluster  .-I  a  pair 
.if  matched  cluster-.  (numerator ),  .11 W  Jed  hv  /he  mt-il  number  ol  personnel  asstencd  *0  rh*  pair  (denominator),  multiplied 
bv  till).  The  nuwrabir  »od  .lenoain.it or  appear  under  each  percentage.  (  lonter*  labeled  At' I  and  M  l  were  treated  aa  one 
cluster  in  the  calculation  of  the  Percent  C.iMaon  Membership  values.  1  Se*  pages  7  and  C-)  for  lurtlier  deactiptlon  ol 
indea) . 


AC  la  an  additional  cluster  selected  < rn*  a  second  search  of  rhe  ««■*>!« 


1 1 « f  1  on .  (See  aleo  pager  C-2  andC-)). 


Table  C-2 


Stability  and  Common  Hemhershlp  of  Hatched  Clusters 
Across  YN1000  Samples  (SOX  of  Total  Sample) 


Table  G-3 


Stability  of  Hatched  Clusters 
Across  ET1000  Samples  (50X  of  Total  Sample) 


                  Correlational Index of Stability for Samples(a)

Total             A vs. Total    B vs. Total    A vs. B
(A+B)             r_TA           r_TB           r_AB

Cluster Id.       1              1              1
Index             .96            .98            .96
N Tasks           566            472            563
N Members         255            154            138

Cluster Id.       2              8              6
Index             .91            .89            .85
N Tasks           412            411            245
N Members         75             14             12

Cluster Id.       3              AC1(b)         3
Index             .98            .97            .94
N Tasks           428            389            427
N Members         61             43             27

Cluster Id.       4              5              2
Index             .99            .99            .97
N Tasks           432            430            444
N Members         173            103            94

Cluster Id.       5              AC2(b)         AC3(b)
Index             .94            .90            .78
N Tasks           422            422            392
N Members         33             13             9

Cluster Id.       6              6              5
Index             .69            .85            .69
N Tasks           398            324            347
N Members         21             49             12

Cluster Id.       7              3              8(c)
Index             .88            .84            .87
N Tasks           362            504            507
N Members         23             22             106

Cluster Id.       8              4              7
Index             .97            .99            .95
N Tasks           578            572            580
N Members         418            131            198

Cluster Id.       9              2              8(c)
Index             .99            .99            .97
N Tasks           518            516            528
N Members         202            83             106

Cluster Id.       10             9              10
Index             .94            .98            .88
N Tasks           501            489            512
N Members         88             22             41

Cluster Id.       11             7              4
Index             .91            .87            .79
N Tasks           355            334            363
N Members         33             22             13

Cluster Id.       12             13             12
Index             .89            .95            .79
N Tasks           215            206            171
N Members         34             12             15

Cluster Id.       13             15             13
Index             .86            .98            .87
N Tasks           322            320            305
N Members         57             22             13

Index Aver        .92            .94            .87
Aver N Tasks      424            415            415
Tot N Members     1473           690            824
% of Sample       74%            69%            83%

(a) Procedures for calculations of r_TA, r_TB, and r_AB are described on page 7.


(b) AC is an additional cluster selected from a second search of the sample solution.
(See also pages C-2 and C-3).

(c) The same sample cluster was allowed to be matched twice if certain criteria were
met (See step 4, page C-2).


Table C-4


Stability of Matched Clusters
Across TM368 Samples (50% of Total Sample)


                  Correlational Index of Stability for Samples(a)

Total             A vs. Total    B vs. Total    A vs. B
(A+B)             r_TA           r_TB           r_AB

Cluster Id.       1              1              1
Index             .94            .99            .96
N Tasks           319            283            288
N Members         144            42             74

Cluster Id.       2              3              7
Index             .90            .72            .51
N Tasks           160            161            146
N Members         16             7              5

Cluster Id.       3              (b)            2
Index             .95            (b)
N Tasks           191
N Members         11                            8

Cluster Id.       4              AC1(c)         AC2(c)
Index             .96            .81            .70
N Tasks           337            302            269
N Members         39             25             6

Cluster Id.       5              5              10
Index             .99            .99            .97
N Tasks           305            305            298
N Members         142            57             70

Cluster Id.       6              6              11
Index             .99            .97            .94
N Tasks           237            245            252
N Members         53             31             25

Cluster Id.       7              8(d)           8
Index             .97            .96            .89
N Tasks           196            189            192
N Members         34             21             14

Cluster Id.       8              8(d)           9
Index             .79            .88            .64
N Tasks           202            186            199
N Members         11             21             8

Cluster Id.       9              4              3
Index             .97            .95            .86
N Tasks           169            173            166
N Members         27             15             11

Cluster Id.       10             2              6
Index             .63            .75            .69
N Tasks           234            135            223
N Members         12             32             5

Cluster Id.       11             9              16
Index             .93            .94            .80
N Tasks           150            146            140
N Members         15             8              8

Cluster Id.       12             10             13
Index             .98            .94            .74
N Tasks           246            286            337
N Members         34             20             16

Index Aver        .92            .90            .79
Aver N Tasks      232            217            226
Tot N Members     538            279            250
% of Sample       73%            76%            68%

(a) Procedures for calculations of r_TA, r_TB, and r_AB are described on page 7.


(b) No sample B cluster could be found that met selection criteria. (See page 6 and
page C-2 for selection criteria).

(c) AC is an additional cluster selected from a second search of the sample solution.
(See also page C-1).

(d) The same sample cluster was allowed to be matched twice if certain criteria were met
(See step 4, page C-2).